[{"data":1,"prerenderedAt":91},["ShallowReactive",2],{"glossary-en-llm-architecture":3},{"id":4,"title":5,"body":6,"description":79,"extension":80,"meta":81,"navigation":86,"path":87,"seo":88,"stem":89,"__hash__":90},"en_glossary/en/glossary/llm-architecture.md","LLM Architecture: Transformer, Attention, Parameters",{"type":7,"value":8,"toc":68},"minimark",[9,14,18,23,26,30,33,37,40,44,52,57],[10,11,13],"h2",{"id":12},"architecture-and-working-principle","Architecture and Working Principle",[15,16,17],"p",{},"To understand how modern AI \"thinks,\" we need to look at its structural components.",[19,20,22],"h3",{"id":21},"transformer","Transformer",[15,24,25],{},"The architecture that forms the foundation of modern LLMs (GPT, Claude, Llama, etc.). It models the relationships between words using the \"attention\" mechanism, which allows the tokens in a sequence to be processed in parallel rather than one at a time.",[19,27,29],{"id":28},"attention-mechanism","Attention Mechanism",[15,31,32],{},"The mechanism that lets the model weigh how relevant each word is to every other word while processing a sentence. It enables the model to understand context and nuance (e.g., interpreting \"bank\" differently depending on whether \"river\" or \"money\" appears nearby).",[19,34,36],{"id":35},"parameters","Parameters",[15,38,39],{},"The numeric weights a model learns during training, which encode what it knows. A model's parameter count (e.g., 70B, or 70 billion) is a rough indicator of its complexity and capacity to handle intricate tasks.",[19,41,43],{"id":42},"context-window","Context Window",[15,45,46,47,51],{},"The maximum amount of data (measured in ",[48,49,50],"strong",{},"tokens",") that a model can \"keep in mind\" at one time. 
A larger context window allows the model to process longer documents or conversation histories.",[53,54,56],"h4",{"id":55},"relevance-to-data-analysis","Relevance to Data Analysis",[15,58,59,60,67],{},"Understanding context windows is crucial when designing ",[48,61,62],{},[63,64,66],"a",{"href":65},"/en/solutions/cloud-iot-data-collection","Cloud & IoT Solutions"," that utilize AI. If you want to analyze a month's worth of log data, the model must have a context window large enough to hold all of that data, or the data must first be summarized or split into smaller chunks.",{"title":69,"searchDepth":70,"depth":70,"links":71},"",2,[72],{"id":12,"depth":70,"text":13,"children":73},[74,76,77,78],{"id":21,"depth":75,"text":22},3,{"id":28,"depth":75,"text":29},{"id":35,"depth":75,"text":36},{"id":42,"depth":75,"text":43},"Understanding the \"brain\" of AI: How Transformers, Attention Mechanisms, and Context Windows work.","md",{"tags":82},[83,22,29,84,85],"LLM","AI Architecture","Deep Learning",true,"/en/glossary/llm-architecture",{"title":5,"description":79},"en/glossary/llm-architecture","rKOKWzaKySHun8H1dZKOCI9pGdmKJhclKoqxIB_WVDY",1778229654811]