AI Flashcards
Bite-sized nuggets of AI knowledge, news, and tips. Mark cards as you read them and track your progress!
Your Progress
0 of 25 cards read
0% complete
What is RAG?
Retrieval-Augmented Generation (RAG) combines a language model with a knowledge retriever. It fetches relevant documents before generating responses, making outputs more factual and context-aware.
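The retrieve-then-generate flow can be sketched in a few lines. This is a toy pipeline: the documents are made up, and the keyword-overlap scorer stands in for a real embedding-based retriever.

```python
# Toy RAG pipeline: retrieve the most relevant document, then build an
# augmented prompt for the generator.

DOCS = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "Python is a programming language created by Guido van Rossum.",
    "RAG combines retrieval with text generation.",
]

def retrieve(query, docs):
    """Return the document sharing the most words with the query."""
    q_words = set(query.lower().split())
    return max(docs, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(query, docs):
    """Prepend the retrieved context so the model can ground its answer."""
    context = retrieve(query, docs)
    return f"Context: {context}\nQuestion: {query}\nAnswer:"

prompt = build_prompt("When was the Eiffel Tower completed?", DOCS)
print(prompt)
```

In production the retriever would search a vector database by embedding similarity, but the shape of the pipeline is the same.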
How Do Transformers Work?
Transformers use self-attention mechanisms to process entire sequences of data at once, allowing models to understand relationships between words regardless of their distance in a sentence.
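Self-attention reduces to a small computation: compare each token's query against every key, turn the scores into weights, and mix the values. A minimal single-head sketch over a two-token sequence (the Q/K/V vectors are hand-picked toys; real models learn them via weight matrices):

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: each token mixes all values,
    weighted by query-key similarity."""
    d = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = K = V = [[1.0, 0.0], [0.0, 1.0]]
result = attention(Q, K, V)
```

Because every token attends to every other token in one step, distance in the sentence does not matter.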
GPT vs BERT
BERT is a bidirectional encoder used mainly for understanding text, while GPT is a decoder designed for generating text. BERT reads context in both directions; GPT predicts one token at a time, left to right.
What Is Fine-Tuning?
Fine-tuning adapts a pre-trained model to a specific task or domain by continuing training on specialized data, improving performance without starting from scratch.
What Are Embeddings?
Embeddings represent words, sentences, or documents as dense numerical vectors that capture semantic meaning — enabling similarity search and reasoning in AI systems.
How Does Tokenization Work?
Tokenization splits text into smaller units called tokens (like words or subwords) so that models can process and learn from language efficiently.
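Subword tokenization can be illustrated with a greedy longest-match over a vocabulary. Real tokenizers (BPE, WordPiece) learn their vocabularies from data; the vocabulary below is invented for the example.

```python
# Toy greedy subword tokenizer: repeatedly take the longest vocabulary
# entry that matches the start of the remaining text.

VOCAB = {"un", "break", "able", "token", "ization", "s", " "}

def tokenize(text):
    tokens = []
    i = 0
    while i < len(text):
        match = None
        for j in range(len(text), i, -1):   # try longest substring first
            if text[i:j] in VOCAB:
                match = text[i:j]
                break
        if match is None:                   # unknown character: emit it alone
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens

print(tokenize("unbreakable tokenizations"))
# → ['un', 'break', 'able', ' ', 'token', 'ization', 's']
```

This is why rare words cost more tokens than common ones: they get split into more pieces.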
Parameters vs Tokens
Tokens are chunks of input or output text, while parameters are the learned weights inside a model. For example, GPT-4 is estimated to have on the order of a trillion parameters and processes thousands of tokens per prompt.
What Is Hallucination?
A hallucination occurs when an AI model confidently generates incorrect or fabricated information that sounds plausible but is not factual.
What Does Temperature Control?
The temperature parameter controls randomness in generation. Low values make answers more focused; high values make them more creative or varied.
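The effect is easy to see numerically: temperature divides the logits before the softmax. A small sketch with toy logit values:

```python
import math

# Softmax with temperature: lower T sharpens the distribution toward
# the top logit; higher T flattens it, adding randomness when sampling.

def softmax_t(logits, temperature):
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_t(logits, 0.2)   # near-deterministic: top token dominates
hot = softmax_t(logits, 2.0)    # flatter: lower-ranked tokens gain probability
```

At temperature → 0 sampling approaches greedy decoding; at high temperature every token becomes nearly equally likely.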
What Is a Context Window?
The context window is the maximum number of tokens an LLM can consider at once. Larger windows enable longer conversations and deeper reasoning.
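A common practical consequence: chat histories must be trimmed to fit the window. A minimal sketch, approximating token counts by whitespace words (real systems use the model's own tokenizer):

```python
# Fit a chat history into a fixed token budget by dropping the oldest
# messages until the rough token count fits.

def trim_to_window(messages, max_tokens):
    """Keep the most recent messages that fit within max_tokens."""
    kept, used = [], 0
    for msg in reversed(messages):
        cost = len(msg.split())
        if used + cost > max_tokens:
            break
        kept.append(msg)
        used += cost
    return list(reversed(kept))

history = ["first old message here", "a middle message", "latest user question"]
window = trim_to_window(history, 6)
```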
What Is Prompt Engineering?
Prompt engineering is the practice of crafting inputs that guide an LLM toward desired behavior, improving accuracy, tone, or reasoning quality.
What Is Chain-of-Thought?
Chain-of-thought prompting encourages models to explain intermediate reasoning steps, improving logical accuracy in complex tasks.
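In practice this is often just a phrase added to the prompt. The exact wording below is illustrative, not canonical:

```python
# A chain-of-thought prompt asks the model to show its work before
# committing to an answer.

def cot_prompt(question):
    return (
        f"Q: {question}\n"
        "Let's think step by step, then give the final answer.\n"
        "A:"
    )

prompt = cot_prompt("If a train travels 60 km in 1.5 hours, what is its speed?")
print(prompt)
```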
RLHF vs SFT
Supervised Fine-Tuning (SFT) teaches a model via labeled examples, while Reinforcement Learning from Human Feedback (RLHF) refines it using human preferences for better alignment.
What Is Agentic AI?
Agentic AI refers to systems that can plan, act, and iterate autonomously toward goals — combining LLMs with tools, memory, and feedback loops.
What Is a Vector Database?
A vector database stores embeddings as high-dimensional vectors and enables fast similarity searches — essential for RAG and recommendation systems.
How RLHF Improves AI
RLHF (Reinforcement Learning from Human Feedback) aligns AI outputs with human preferences, reducing harmful or nonsensical responses.
What Is Multimodal AI?
Multimodal AI can process and generate multiple data types — like text, images, audio, and video — within a single model, enabling richer interactions.
How Do LLMs Learn?
Large Language Models learn by predicting the next token in billions of text samples, gradually capturing patterns, facts, and reasoning abilities from data.
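The core idea of next-token prediction can be shown with a tiny bigram "model": count which token follows which in a corpus, then predict the most frequent successor. LLMs do the same thing with neural networks over billions of texts.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

# Count successor frequencies for each token.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(token):
    """Most frequent token observed after `token` in the corpus."""
    return counts[token].most_common(1)[0][0]

print(predict_next("the"))  # → "cat" ("cat" follows "the" twice, "mat" once)
```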
What Is Context Compression?
Context compression summarizes or encodes long histories into compact representations, helping models recall relevant information without exceeding token limits.
What Are Synthetic Datasets?
Synthetic datasets are AI-generated data used to train or fine-tune models when real data is scarce or sensitive, improving diversity and privacy.
What Is a Knowledge Graph?
A knowledge graph connects entities and relationships in structured form, enabling reasoning and retrieval across complex data networks.
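The structure is just subject-predicate-object triples plus queries over them. A toy sketch (real systems such as RDF stores add indexing and multi-hop reasoning on top of this idea; the facts below are illustrative):

```python
# Knowledge graph as triples, with a one-hop query.

triples = [
    ("Ada Lovelace", "wrote_about", "Analytical Engine"),
    ("Analytical Engine", "designed_by", "Charles Babbage"),
    ("Ada Lovelace", "born_in", "London"),
]

def query(subject, predicate):
    """Return all objects linked to subject by predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(query("Ada Lovelace", "wrote_about"))  # → ['Analytical Engine']
```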
How Does Memory Work in AI Agents?
Memory allows AI agents to store past interactions or retrieved facts, enabling continuity, learning over time, and goal-oriented behavior.
What Is Embedding Search?
Embedding search finds semantically similar items by comparing vector distances — enabling contextual search and retrieval in RAG pipelines.
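The comparison itself is usually cosine similarity. A brute-force sketch over made-up 3-dimensional vectors (real embeddings have hundreds of dimensions and come from a trained model):

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product of the normalized vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

docs = {
    "doc_cats":   [0.9, 0.1, 0.0],
    "doc_dogs":   [0.8, 0.3, 0.1],
    "doc_stocks": [0.0, 0.1, 0.9],
}
query_vec = [1.0, 0.0, 0.0]

# Rank documents by similarity to the query vector.
ranked = sorted(docs, key=lambda d: cosine(docs[d], query_vec), reverse=True)
print(ranked[0])  # → doc_cats
```

Vector databases replace this linear scan with approximate nearest-neighbor indexes so it scales to millions of vectors.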
What Is LLM Evaluation?
LLM evaluation measures factual accuracy, reasoning, faithfulness, and style using metrics like BLEU, ROUGE, or human preference scoring.
Future of Open-Weight Models
Open-weight models like LLaMA, Mistral, and Falcon are driving transparency and innovation, letting developers fine-tune and self-host powerful AI locally.