Build a Large Language Model (From Scratch)

Contents
- Welcome
- 1 Understanding Large Language Models
- 2 Working with Text Data
- 3 Coding Attention Mechanisms
- 4 Implementing a GPT Model from Scratch To Generate Text
- 5 Pretraining on Unlabeled Data
- Appendix A. Introduction to PyTorch
- Appendix B. References and Further Reading
- Appendix C. Exercise Solutions
- Appendix D. Adding Bells and Whistles to the Training Loop

Welcome

Thank you for purchasing the MEAP edition of Build a Large Language Model (From Scratch). In this book, I invite you to embark on an educational journey with me to learn how to build Large Language Models (LLMs) from the ground up. Together, we'll delve deep into the LLM training pipeline, starting from data loading and culminating in finetuning LLMs on custom datasets.

For many years, I've been deeply immersed in the world of deep learning, coding LLMs, and have found great joy in explaining complex concepts thoroughly. This book has been a long-standing idea in my mind, and I'm thrilled to finally have the opportunity to write it and share it with you. Those of you familiar with my work, especially from my blog, have likely seen glimpses of my approach to coding from scratch. This method has resonated well with many readers, and I hope it will be equally effective for you.

I've designed the book to emphasize hands-on learning, primarily using PyTorch and without relying on pre-existing libraries. With this approach, coupled with numerous figures and illustrations, I aim to provide you with a thorough understanding of how LLMs work, their limitations, and customization methods. Moreover, we'll explore commonly used workflows and paradigms in pretraining and fine-tuning LLMs, offering insights into their development and customization. The book is structured with detailed step-by-step introductions, ensuring no critical detail is overlooked.

To gain the most from this book, you should have a background in Python programming. Prior experience in deep learning and a foundational understanding of PyTorch, or familiarity with other deep learning frameworks like TensorFlow, will be beneficial.

I warmly invite you to engage in the liveBook discussion forum for any questions, suggestions, or feedback you might have. Your contributions are immensely valuable and appreciated in enhancing this learning journey.

—Sebastian Raschka

1 Understanding Large Language Models

This chapter covers
- High-level explanations of the fundamental concepts behind large language models (LLMs)
- Insights into the transformer architecture from which ChatGPT-like LLMs are derived
- A plan for building an LLM from scratch

Large language models (LLMs), such as those offered in OpenAI's ChatGPT, are deep neural network models that have been developed over the past few years. They ushered in a new era for Natural Language Processing (NLP). Before the advent of large language models, traditional methods excelled at categorization tasks such as email spam classification and straightforward pattern recognition that could be captured with handcrafted rules or simpler models. However, they typically underperformed in language tasks that demanded complex understanding and generation abilities, such as parsing detailed instructions, conducting contextual analysis, or creating coherent and contextually appropriate original text. For example, previous generations of language models could not write an email from a list of keywords—a task that is trivial for...