Build a Large Language Model (From Scratch)

Contents
- Welcome
- 1 Understanding Large Language Models
- 2 Working with Text Data
- 3 Coding Attention Mechanisms
- 4 Implementing a GPT Model from Scratch To Generate Text
- 5 Pretraining on Unlabeled Data
- Appendix A. Introduction to PyTorch
- Appendix B. References and Further Reading
- Appendix C. Exercise Solutions
- Appendix D. Adding Bells and Whistles to the Training Loop

Welcome

Thank you for purchasing the MEAP edition of Build a Large Language Model (From Scratch). In this book, I invite you to embark on an educational journey with me to learn how to build Large Language Models (LLMs) from the ground up. Together, we'll delve deep into the LLM training pipeline, starting from data loading and culminating in finetuning LLMs on custom datasets.

For many years, I've been deeply immersed in the world of deep learning, coding LLMs, and have found great joy in explaining complex concepts thoroughly. This book has been a long-standing idea in my mind, and I'm thrilled to finally have the opportunity to write it and share it with you. Those of you familiar with my work, especially from my blog, have likely seen glimpses of my approach to coding from scratch. This method has resonated well with many readers, and I hope it will be equally effective for you.

I've designed the book to emphasize hands-on learning, primarily using PyTorch and without relying on pre-existing libraries. With this approach, coupled with numerous figures and illustrations, I aim to provide you with a thorough understanding of how LLMs work, their limitations, and customization methods. Moreover, we'll explore commonly used workflows and paradigms in pretraining and fine-tuning LLMs, offering insights into their development and customization. The book is structured with detailed step-by-step introductions, ensuring no critical detail is overlooked.

To gain the most from this book, you should have a background in Python programming. Prior experience in deep learning and a foundational understanding of PyTorch, or familiarity with other deep learning frameworks like TensorFlow, will be beneficial.

I warmly invite you to engage in the liveBook discussion forum for any questions, suggestions, or feedback you might have. Your contributions are immensely valuable and appreciated in enhancing this learning journey.

—Sebastian Raschka

1 Understanding Large Language Models

This chapter covers
- High-level explanations of the fundamental concepts behind large language models (LLMs)
- Insights into the transformer architecture from which ChatGPT-like LLMs are derived
- A plan for building an LLM from scratch

Large language models (LLMs), such as those offered in OpenAI's ChatGPT, are deep neural network models that have been developed over the past few years. They ushered in a new era for Natural Language Processing (NLP). Before the advent of large language models, traditional methods excelled at categorization tasks such as email spam classification and straightforward pattern recognition that could be captured with handcrafted rules or simpler models. However, they typically underperformed in language tasks that demanded complex understanding and generation abilities, such as parsing detailed instructions, conducting contextual analysis, or creating coherent and contextually appropriate original text. For example, previous generations of language models could not write an email from a list of keywords—a task that is trivial for...