水木
Published 2024-09-14

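Deconstructing Large Language Models: From Linear Regression to Artificial General Intelligence (Full Color) (解构大语言模型:从线性回归到通用人工智能)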

Link: https://pan.baidu.com/s/1B8sAv3u0vI4rdWq7QgIe1A?pwd=h8mf Extraction code: h8mf

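Publisher: Publishing House of Electronics Industry (电子工业出版社)
ISBN: 9787121477409
Edition: 1
Product code: 14596264
Brand: Broadview (博文视点)
Binding: Paperback
Format: 16开
Publication date: 2024-05-01
Paper: Offset paper
Pages: 432
Product Features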

Editor's Recommendations

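+ Deconstructs large language models from two angles: model structure and data foundations.

+ Explains the core structure and implementation of classic models in detail, building a solid foundation.

+ Covers model development and tuning, rebuilds ChatGPT, and comes with companion code on GitHub.

+ Integrates knowledge from statistical analysis, machine learning, economics, and other fields; printed in full color.

+ Free companion video lessons (continuously updated), available by joining the reader group.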




About the Book

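This book deconstructs large language models from two perspectives, model structure and data foundations, to help readers understand and build systems similar to ChatGPT. On the structural side, large language models are deep neural networks whose core design element is the attention mechanism, so the book covers classic models such as multilayer perceptrons, convolutional neural networks, and recurrent neural networks. On the data side, the book addresses both the engineering foundations of model training, such as backpropagation, and the ways data is used, such as transfer learning and reinforcement learning, as well as traditional supervised and unsupervised learning. The book also explains how to draw inspiration from econometrics and classic machine learning models to improve model stability and interpretability.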

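This book is intended for readers who want an in-depth understanding of large language models and who want to solve practical problems with artificial intelligence. It is also suitable as a reference for teachers and students in computer science and related programs at colleges and universities.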

About the Author

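Tang Gen (唐亘) is a data scientist focusing on artificial intelligence and big data. He is an active contributor to open-source projects such as Apache Spark and scikit-learn, and has delivered more than a hundred technical training sessions for organizations including Huawei and Fudan University. He is the author of Mastering Data Science: From Linear Regression to Deep Learning (《精通数据科学:从线性回归到深度学习》) and has served as a technical reviewer for Packt, described here as the UK's largest online publisher. He graduated from Fudan University with dual bachelor's degrees in mathematics and computer science, and later studied at École Polytechnique, where he earned dual master's degrees in economics and data science.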




Reviews

Bringing computers to the level of human intelligence, and empowering them to handle a variety of complex tasks, has long been the dream of computer scientists. This pursuit, which began with the birth of computers, has gradually evolved into the Artificial Intelligence (AI) that captures the spotlight today. In recent years, AI has achieved remarkable milestones and earned widespread recognition. This success rests on three pivotal elements: data, models, and computational power. It is worth noting that computational power, while vital, reflects hardware advances and does not constitute the essence of the discipline.

Data, in essence, are raw alphanumeric values that take various forms, such as statistical data, images, and text. They serve as the vehicle for storing knowledge. Our objective goes beyond mere data acquisition; it involves processing data systematically and efficiently to gain a profound understanding of the knowledge they contain. The methods that carry out this intricate process are referred to as models or algorithms.

The depth and breadth of the knowledge within data determine the level of intelligence AI can achieve. For instance, if the data are limited to images of numbers, AI cannot attain human-level capabilities. Language is renowned for encapsulating a wealth of knowledge spanning culture, thought, and science. The triumph of Large Language Models (LLMs), such as ChatGPT, lies specifically in their use of textual data, which gives them the chance to interact extensively with the vast system of human knowledge.

This book defines AI as a novel intelligent entity, with data as its material foundation and models as the foundation of its intelligence. It then explores Large Language Models while keeping the text succinct and focused. Compared with similar works, this is a distinctive advantage. By reading this book, one can not only acquire a comprehensive understanding of the general approaches and processes of the discipline, but also delve into the forefront of Large Language Models. Furt