AI, Cloud and GPU Glossary for Beginners

This glossary explains the most common words, abbreviations, tools and technical terms used across GPUJet. It is written for beginners who are learning about AI agents, cloud hosting, GPU cloud, APIs, VPS setup, automation, cost planning and safer AI workflows.

You do not need to memorize every term at once. Use this page as a reference while reading GPUJet tutorials, infrastructure guides, pricing pages and automation checklists.

AI and Model Terms

AI

AI means artificial intelligence. In practical beginner projects, AI usually refers to software that can generate text, summarize information, classify content, answer questions, create images, help with code or automate tasks.

AI Model

An AI model is the trained system that turns an input into an output, such as a text reply, an image or a transcription. Examples include text models, image models, speech models and code models. A project may call a model through an API or run a model directly on its own infrastructure.

LLM

LLM means large language model. It is a type of AI model trained to understand and generate language. LLMs are commonly used for chatbots, writing assistants, summarization, coding help and AI agents.

Prompt

A prompt is the instruction or input given to an AI model. A better prompt usually gives clearer context, rules, examples and desired output format.

Context Window

The context window is the maximum amount of text, measured in tokens, that an AI model can consider at one time. A larger context window can handle longer documents, but it still has limits and does not replace well-structured input.
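
When a document is too long for the context window, one common approach is to split it into smaller pieces and process them one at a time. The sketch below is only an illustration: it counts words as a rough stand-in for tokens, and the function name split_into_chunks is hypothetical.

```python
def split_into_chunks(text, max_words):
    """Split text into pieces of at most max_words words each.

    Words are only a rough stand-in for tokens; real token counts
    depend on the tokenizer used by the model.
    """
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


chunks = split_into_chunks("one two three four five", max_words=2)
print(chunks)
```

Each chunk can then be summarized or searched separately, which is often cheaper than sending the whole document in one request.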

Token

A token is a small unit of text processed by an AI model, such as a word, part of a word or a punctuation mark. API pricing is often based on input tokens and output tokens, so token usage matters for cost planning.

Advanced AI and Data Terms

Inference

Inference is the process of using an AI model to produce an answer, prediction or output. When a chatbot replies to a prompt, that is inference.

Training

Training is the process of teaching an AI model from large amounts of data. Training usually requires significant compute power and is different from simply using an existing model through an API.

Fine-Tuning

Fine-tuning means adapting an existing AI model with additional examples so it performs better for a specific task, style or domain. It is not always necessary; many beginner projects can start with better prompts and retrieval instead.

Embedding

An embedding is a numerical representation of text, images or other data. Embeddings help software compare meaning, find similar content and power search systems used in AI applications.
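
One way to see how embeddings compare meaning is cosine similarity, a standard measure of how closely two vectors point in the same direction. The sketch below uses made-up three-number vectors purely for illustration; real embeddings come from an embedding model and usually have hundreds or thousands of numbers.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up example vectors (assumption: not produced by a real model).
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.0]
invoice = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten))   # close to 1: similar meaning
print(cosine_similarity(cat, invoice))  # close to 0: unrelated meaning
```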

Vector Database

A vector database stores embeddings and helps find similar information quickly. It is often used when an AI app needs to search documents, knowledge bases or previous records.
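
The idea can be sketched with a tiny in-memory store. This is a teaching toy, not how a real vector database is built: the class name TinyVectorStore and the two-number vectors are hypothetical, and real products use indexing techniques to stay fast with millions of records.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


class TinyVectorStore:
    """Keeps (text, vector) pairs in a list and searches them one by one."""

    def __init__(self):
        self.items = []

    def add(self, text, vector):
        self.items.append((text, vector))

    def search(self, query_vector, top_k=1):
        """Return the top_k stored texts most similar to query_vector."""
        ranked = sorted(
            self.items,
            key=lambda item: _cosine(query_vector, item[1]),
            reverse=True,
        )
        return [text for text, _ in ranked[:top_k]]


store = TinyVectorStore()
store.add("Refund policy", [0.9, 0.1])  # made-up vectors
store.add("GPU pricing", [0.1, 0.9])
print(store.search([0.8, 0.2]))
```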

RAG

RAG means retrieval-augmented generation. It is a method where an AI system first retrieves relevant information, then uses that information to produce a better answer.
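
The two steps, retrieve then generate, can be sketched as follows. To stay self-contained, this example ranks documents by simple word overlap instead of embeddings, and it stops at building the prompt rather than calling a model. The sample documents and the function names are assumptions made for illustration.

```python
DOCUMENTS = [
    "A VPS gives more control than basic hosting.",
    "GPU cloud costs can grow quickly if sessions are left running.",
    "An API key should be stored safely and never shared publicly.",
]


def words(text):
    """Lowercase the text and return its words without basic punctuation."""
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(question, documents, top_k=1):
    """Return the top_k documents sharing the most words with the question."""
    question_words = words(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & words(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, documents):
    """Combine the retrieved context and the question into one prompt."""
    context = "\n".join(retrieve(question, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("How should an API key be stored?", DOCUMENTS)
print(prompt)
```

The resulting prompt would then be sent to a model, which answers from the retrieved text instead of relying only on what it learned in training.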

AI Agents and Automation Terms

AI Agent

An AI agent is an AI-powered system that can follow steps, use tools, make decisions within limits and produce an output. A safe beginner agent usually creates drafts, logs actions and asks for approval before doing anything risky.

Tool Use

Tool use means an AI system can call external tools, such as search, calculators, databases, APIs, websites, email systems or WordPress actions. Tool access should be limited and monitored.

Workflow

A workflow is a sequence of steps. For example: receive input, classify the task, call a model, create a draft, ask for human approval and save a log.
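
The example steps above can be sketched as a small function. Everything here is hypothetical: fake_model stands in for a real model call, and the approve argument stands in for a person reviewing the draft.

```python
def classify(task_text):
    """Very rough task classifier based on a keyword."""
    return "summary" if "summarize" in task_text.lower() else "other"


def fake_model(prompt):
    """Stand-in for a real model call (assumption: no API is contacted)."""
    return f"DRAFT for: {prompt}"


def run_workflow(task_text, approve):
    """Run each step in order and record what happened in a log."""
    log = []
    kind = classify(task_text)
    log.append(f"classified as {kind}")
    draft = fake_model(task_text)
    log.append("draft created")
    approved = approve(draft)
    log.append(f"approved={approved}")
    return {"draft": draft, "approved": approved, "log": log}


# Here the "human" is simulated by a function that always approves.
result = run_workflow("Summarize this page", approve=lambda draft: True)
print(result["log"])
```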

Automation

Automation means software performs repeated tasks without manual action every time. Automation can save time, but it should have limits, logs and rollback options.

Human Approval

Human approval means a person reviews and approves an AI action before it becomes final. This is important before publishing content, sending messages, deleting data or spending money.

Guardrails

Guardrails are rules and limits that keep an AI workflow safer. Examples include budget limits, approval steps, blocked actions, private data filters and stop conditions.
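
A guardrail can be as simple as a check that runs before every action. The sketch below is one possible shape, with made-up action names; a real workflow would define its own lists.

```python
# Hypothetical action names chosen for illustration.
BLOCKED_ACTIONS = {"delete_data"}
NEEDS_APPROVAL = {"publish_post", "send_email", "spend_money"}


def check_action(action, approved_by_human=False):
    """Decide whether an action is blocked, needs approval or is allowed."""
    if action in BLOCKED_ACTIONS:
        return "blocked"
    if action in NEEDS_APPROVAL and not approved_by_human:
        return "needs_approval"
    return "allowed"


print(check_action("create_draft"))
print(check_action("publish_post"))
print(check_action("publish_post", approved_by_human=True))
print(check_action("delete_data", approved_by_human=True))
```

Note that a blocked action stays blocked even with approval, which keeps the most dangerous operations out of the workflow entirely.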

Cloud, Hosting and Infrastructure Terms

API

API means application programming interface. It is a way for one software system to communicate with another. In AI projects, an app may call a model API to send a prompt and receive an AI response.

API Key

An API key is a secret credential used to access an API. It should be stored safely, never shared publicly and limited whenever possible.

Hosting

Hosting is where a website or app lives online. Basic hosting is often enough for websites, landing pages and simple content projects.

VPS

VPS means virtual private server. It gives more control than basic hosting and is often used for apps, bots, APIs, background jobs, Docker containers and AI agent workflows.

GPU

GPU means graphics processing unit. In AI, GPUs are useful for heavy computation such as model training, local inference, image generation or running large models directly.

GPU Cloud

GPU cloud means renting GPU-powered servers from a provider instead of buying hardware. It can be useful for short experiments, but costs can grow quickly if sessions are left running.
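
A little arithmetic shows why forgotten sessions matter. The hourly rate below is a made-up number for illustration, not a real provider price.

```python
def gpu_session_cost(hourly_rate, hours):
    """Return the cost of a GPU session billed by the hour."""
    return hourly_rate * hours


HOURLY_RATE = 2.00  # hypothetical price in dollars per hour

short_experiment = gpu_session_cost(HOURLY_RATE, 2)    # stopped after 2 hours
left_running = gpu_session_cost(HOURLY_RATE, 72)       # forgotten for 3 days

print(f"Short experiment: ${short_experiment:.2f}")
print(f"Left running:     ${left_running:.2f}")
```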

Serverless

Serverless means the provider manages the server infrastructure and charges based on usage. It can be useful for event-based workloads, but pricing and limits should be understood before scaling.

Docker

Docker is a tool for packaging software and its dependencies into containers. It helps projects run consistently across development, VPS and cloud environments.

Web, Server and Deployment Terms

Domain

A domain is the human-readable address of a website, such as example.com. It points users to the correct website or service.

DNS

DNS means domain name system. It connects a domain name to the server or service where the website, app or email system is hosted.

SSL Certificate

An SSL certificate helps secure the connection between a visitor and a website. It is what allows a site to use HTTPS instead of only HTTP. Modern sites actually use TLS, the successor to SSL, but the older name is still common.

Webhook

A webhook is a way for one service to send data to another service when something happens. For example, a form submission can trigger an automation workflow through a webhook.
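
On the receiving side, a webhook is usually just a function that reads the incoming data and decides what to do. The sketch below assumes a JSON payload with hypothetical fields named event, data and email; each real service defines its own payload format.

```python
import json


def handle_webhook(raw_body):
    """Parse a webhook body and decide whether to start the workflow."""
    payload = json.loads(raw_body)
    if payload.get("event") != "form_submitted":
        # Ignore events this workflow does not care about.
        return {"status": "ignored"}
    return {"status": "queued", "email": payload["data"]["email"]}


body = b'{"event": "form_submitted", "data": {"email": "user@example.com"}}'
print(handle_webhook(body))
```

In a real deployment this function would sit behind a web server route, and the sender should be verified before the payload is trusted.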

Cron Job

A cron job is a scheduled task that runs automatically at a chosen time or interval. It can be used for backups, reports, cleanup tasks or recurring AI workflows.

Uptime

Uptime is the share of time a website, server or app is available and working, usually expressed as a percentage such as 99.9%. Higher uptime is important for production systems.
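
Uptime is often quoted as a percentage, and it can help to translate that figure into allowed downtime. The sketch below assumes a 30-day month; the function name is hypothetical.

```python
def allowed_downtime_minutes(uptime_percent, days=30):
    """Return how many minutes of downtime fit within an uptime percentage."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


for percent in (99.0, 99.9, 99.99):
    minutes = allowed_downtime_minutes(percent)
    print(f"{percent}% uptime allows about {minutes:.1f} minutes of downtime")
```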

Latency

Latency is delay. In AI apps, latency may mean how long it takes for a request to reach the model and return a response.
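
Latency can be measured by timing a call from start to finish. In the sketch below, slow_echo is a hypothetical stand-in for a model request; it simply waits a short time before returning.

```python
import time


def measure_latency(func, *args):
    """Call func and return its result plus the elapsed time in milliseconds."""
    start = time.perf_counter()
    result = func(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


def slow_echo(text):
    """Stand-in for a model call (assumption: waits about 50 ms)."""
    time.sleep(0.05)
    return text


reply, latency_ms = measure_latency(slow_echo, "hello")
print(f"Reply: {reply}, latency: {latency_ms:.0f} ms")
```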

Bandwidth

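Bandwidth is the rate at which data can be transferred. Hosting plans often use the word for the total amount of data transferred between users, servers and services in a billing period. High traffic, file downloads, images and video can increase bandwidth usage.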

Security, Backup and Production Terms

Environment Variable

An environment variable is a setting stored outside the main code. It is commonly used for API keys, database URLs, feature flags and deployment settings.
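
In Python, environment variables are read through os.environ from the standard library. The sketch below shows one way to load an API key and fail clearly when it is missing; the variable name EXAMPLE_API_KEY is a made-up placeholder.

```python
import os


def load_api_key(name="EXAMPLE_API_KEY"):
    """Read an API key from the environment instead of hard-coding it."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"Environment variable {name} is not set")
    return key


if __name__ == "__main__":
    try:
        key = load_api_key()
        # Print only the length so the secret never appears in output.
        print("Key loaded, length:", len(key))
    except RuntimeError as err:
        print(err)
```

Keeping the key out of the code means the same program can run in development, staging and production with different settings, and nothing sensitive ends up in a repository.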

Secret

A secret is sensitive information such as an API key, password, token or private credential. Secrets should not be published in code, screenshots or public repositories.

Backup

A backup is a saved copy of a website, database, server or project. Backups help recover from mistakes, plugin problems, broken updates or data loss.

Staging Environment

A staging environment is a private test copy of a website or app. It lets builders test changes before applying them to the live site.

Production

Production means the live version of a website, app or workflow that real users can access. Production systems need stronger monitoring, backups and rollback plans.

CI/CD

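CI/CD means continuous integration and continuous delivery (or continuous deployment). It is a development practice in which changes are tested automatically when they are merged and then released through a repeatable, automated process, which makes deployments more reliable.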

Incident

An incident is an unexpected problem that affects a system, such as downtime, data exposure, broken automation, high API spend or failed deployment.

Cost, Safety and Monitoring Terms

Input Tokens

Input tokens are the tokens sent to an AI model. They include the prompt, instructions, context and any text the model must read before answering.

Output Tokens

Output tokens are the tokens generated by the AI model as its answer. Longer answers usually use more output tokens and may cost more when using paid APIs.
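
Because input and output tokens are often priced separately, a rough cost estimate needs both counts. The prices in this sketch are made-up placeholders; check a provider's current pricing page for real numbers.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Return the estimated cost in dollars for one request."""
    input_cost = input_tokens / 1_000_000 * input_price_per_million
    output_cost = output_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


# Hypothetical prices: $1.00 per million input tokens,
# $4.00 per million output tokens.
cost = estimate_cost(
    input_tokens=2_000,
    output_tokens=500,
    input_price_per_million=1.00,
    output_price_per_million=4.00,
)
print(f"Estimated cost per request: ${cost:.4f}")
```

Multiplying the per-request estimate by the expected number of requests per day gives a first approximation of a monthly budget.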

Rate Limit

A rate limit controls how many requests can be made in a certain time period. Rate limits help protect systems from overload and unexpected API spending.
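
One simple way to enforce a rate limit is a sliding window: remember when recent requests happened and refuse new ones once the window is full. This is a minimal sketch with a hypothetical class name; providers implement their own limits in different ways.

```python
from collections import deque


class RateLimiter:
    """Allow at most max_requests within any window_seconds period."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def allow(self, now):
        """Return True if a request at time `now` (in seconds) is allowed."""
        # Forget requests that have fallen out of the window.
        while self.timestamps and now - self.timestamps[0] >= self.window_seconds:
            self.timestamps.popleft()
        if len(self.timestamps) < self.max_requests:
            self.timestamps.append(now)
            return True
        return False


limiter = RateLimiter(max_requests=2, window_seconds=60)
print([limiter.allow(t) for t in (0, 1, 2, 60)])
```

Passing the time in as an argument keeps the example easy to test; a real program would typically pass time.monotonic().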

Budget Limit

A budget limit is a spending cap for a project, API key or cloud account. It helps prevent surprise bills when usage grows or a workflow loops unexpectedly.
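
Inside a workflow, a budget limit can be enforced by tracking spend and refusing any step that would exceed the cap. The class below is a hypothetical sketch; it complements, rather than replaces, the spending limits offered in a provider's account settings.

```python
class BudgetGuard:
    """Track spending and stop the workflow before the cap is exceeded."""

    def __init__(self, limit_dollars):
        self.limit_dollars = limit_dollars
        self.spent = 0.0

    def charge(self, amount):
        """Record a cost, or raise if it would push spending past the limit."""
        if self.spent + amount > self.limit_dollars:
            raise RuntimeError("Budget limit reached; stopping workflow")
        self.spent += amount


guard = BudgetGuard(limit_dollars=1.00)
guard.charge(0.60)
try:
    guard.charge(0.60)  # would bring the total to 1.20
except RuntimeError as err:
    print(err)
print(f"Total spent: ${guard.spent:.2f}")
```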

Logs

Logs are records of what happened inside a system. AI workflow logs may include input, model used, tool called, output, error message, approval result and cost estimate.
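
A common way to keep such records is one JSON object per line, which is easy to search and parse later. The field names below mirror the list above but are otherwise an assumption, and example-model is a placeholder name.

```python
import json
from datetime import datetime, timezone


def make_log_entry(user_input, model, tool, output, error, approved, cost_estimate):
    """Return one workflow event as a single JSON line."""
    entry = {
        "time": datetime.now(timezone.utc).isoformat(),
        "input": user_input,
        "model": model,
        "tool": tool,
        "output": output,
        "error": error,
        "approved": approved,
        "cost_estimate": cost_estimate,
    }
    return json.dumps(entry)


line = make_log_entry(
    user_input="Summarize this page",
    model="example-model",
    tool=None,
    output="Short summary text",
    error=None,
    approved=True,
    cost_estimate=0.004,
)
print(line)
```

Avoid writing secrets or private user data into logs; record only what is needed to understand and debug the workflow.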

Monitoring

Monitoring means watching a system for errors, unusual usage, downtime, high cost, failed jobs or risky outputs. Monitoring is important before going live.

Rollback

Rollback means returning a system to a previous safe state. This can include restoring a backup, disabling a plugin, revoking an API key or stopping a server.

Least Privilege

Least privilege means giving a tool or user only the minimum permissions needed. This reduces damage if a key, account or workflow is misused.

Related GPUJet Guides

  • Start Here — beginner path for AI, cloud and GPU learning.
  • AI Infrastructure Hub — central guide for agents, APIs, VPS, GPU cloud and costs.
  • Cloud — hosting, VPS, API-first AI and GPU cloud comparison.
  • Prices — AI API and infrastructure cost planning.
  • AI Agent — how AI agents use tools, workflows, logs and guardrails.
  • Tutorials — step-by-step learning resources.

This glossary is educational and will be expanded over time as GPUJet adds new tutorials and infrastructure guides.