This glossary defines key terms in Artificial Intelligence, Machine Learning, Data Science, and Software Engineering.
A
Activation Functions
Functions that enable neural networks to learn non-linear relationships between features and the label.
Popular activation functions include ReLU and Sigmoid.
For more details, refer to Google Developers ML Course and Keras Activations.
Adenocarcinoma
Adenocarcinoma is a type of cancer that starts in glandular (secretory) cells—the cells that produce and release fluids like mucus. It can occur in organs such as the lungs, colon, breast, prostate, or pancreas.
Example: The biopsy revealed that the tumor was an adenocarcinoma of the colon.
Anecdote
A short amusing or interesting story about a real incident or person.
Example: He told anecdotes about his job.
Antibodies
Antibodies are proteins produced by the immune system that recognize and attach to foreign substances like bacteria, viruses, or toxins to help destroy them. They are a vital part of your immune defense system.
Example: After getting vaccinated, your body produces antibodies to fight off the virus in the future.
Synonyms / Related Terms:
- Immunoglobulins (technical term)
- Disease-fighting proteins
- Immune proteins
Architecture
The skeleton of the model — the definition of each layer and each operation that happens within the model.
Example: BERT is an architecture, while bert-base-cased, a set of weights trained by the Google team for the first release of BERT, is a checkpoint. However, one can say “the BERT model” and “the bert-base-cased model.”
B
Bilateral ureteral obstruction
Bilateral ureteral obstruction in patients with prostate cancer means that both ureters (the tubes that carry urine from the kidneys to the bladder) are blocked, usually because the cancer is compressing or invading them. Both kidneys are at risk because urine cannot drain properly. It’s a serious complication that needs urgent management to prevent kidney failure.
What causes this in prostate cancer? The prostate is located just below the bladder, and when prostate cancer:
- Grows large
- Invades nearby tissues
- Or spreads to lymph nodes
…it can press against or invade the ureters, especially where they pass close to the prostate.
Why is this dangerous?
- Urine backs up into the kidneys, causing hydronephrosis (kidney swelling)
- Leads to kidney damage or failure if untreated
- Can cause fatigue, nausea, flank pain, and low urine output
Treatment options:
- Ureteral stents (small tubes inserted to keep ureters open)
- Nephrostomy tubes (drains urine directly from kidneys through the back)
- Treating the underlying cancer (e.g. hormone therapy, radiation)
Breakthrough Device Designation (by FDA)
This is not a full approval but a special status granted to devices that:
- Provide more effective diagnosis or treatment for serious/life-threatening diseases
- Have the potential to address unmet medical needs
It gives:
- Faster review process
- Closer communication with the FDA
- Priority support during development and review
In March 2019, Paige.AI was granted the breakthrough device designation by the US Food and Drug Administration (FDA) for its AI system in cancer diagnosis, meaning the FDA saw the system as high-potential and innovative.
C
Causal
Relating to or acting as a cause.
Example: Causality, Causal Effect
Pronunciation: kaw·zl
Causality
The relationship between cause and effect.
Example: Causality, Causal Effect
Pronunciation: kaw·za·luh·tee
Conformité Européenne (CE Mark)
- The CE mark is required for medical devices (including software) sold in the European Economic Area (EEA).
- It means the device meets EU safety, health, and environmental protection standards.
A CE mark allows you to legally market a device in Europe.
Coarse
Rough or harsh in texture or structure, inferior quality. Can also refer to a person or their speech being rude or vulgar.
Example: The pointwise evaluation prompt approach might be coarse for our system.
Similar words: rough, bristly, scratchy, rude
Opposite words: fine
Pronunciation: kaws, kors
D
De Novo Clearance (by FDA)
This is an FDA regulatory pathway used to approve low- to moderate-risk medical devices that:
- Are novel (new to the market)
- Have no predicate device (i.e., nothing similar already approved)
This pathway allows companies to bring new and innovative medical devices to market.
Once granted:
- The product is fully cleared for use
- A new device type is created, which future similar products can reference
Paige Prostate Detect (PPD) received de novo clearance (DEN200080) from the FDA in September 2021. It became the first FDA-cleared AI software for detecting prostate cancer in digital slides.
Distant metastasis
In patients with prostate cancer, distant metastasis means that the cancer has spread beyond the pelvis to organs or tissues that are far from the prostate, such as:
- Bones (especially the spine, ribs, hips, pelvis, and long bones)
- Lungs
- Liver
- Distant lymph nodes (outside the pelvic area)
The disease is then considered advanced or metastatic prostate cancer. It requires systemic treatment and close monitoring.
How does it happen? Prostate cancer cells can:
- Enter the bloodstream or lymphatic system
- Travel to distant organs
- Form new tumors at those sites
Staging context:
- Distant metastasis corresponds to Stage IV (advanced stage) prostate cancer
- In TNM staging, it’s often labeled as M1 (M = metastasis)
E
Ecological Fallacy
Occurs when one draws conclusions about individuals based solely on group-level data.
Example: Ecological Fallacy
Ellipsis
A punctuation mark consisting of a series of three dots. An ellipsis can be used in many ways, such as for intentional omission of text or numbers.
Example: A set of dots (…) indicating an ellipsis.
Empirical
Based on, concerned with, or verifiable by observation or experience rather than theory or pure logic.
Example: They provided considerable empirical evidence to support their argument.
Opposite words: theoretical, non-empirical
F
Fallacy
A mistaken belief, especially one based on unsound arguments. An idea which many people believe to be true, but which is in fact false because it is based on incorrect information or reasoning.
Example: The notion that the camera never lies is a fallacy.
Similar words: misconception, misbelief, delusion
Pronunciation: fa·luh·see
FDA (Food and Drug Administration, USA)
The regulatory agency in the United States that approves drugs, medical devices, and diagnostics.
Fidelity
The degree of exactness with which something is copied or reproduced.
G
Genitourinary
Genitourinary refers to the organs of the reproductive and urinary systems. It combines “genital” (related to reproduction) and “urinary” (related to urine and the urinary tract).
Example: The patient was referred to a genitourinary specialist for evaluation of kidney and bladder issues.
Similar words: Urogenital, Reproductive and urinary
Genitourinary Pathologists
What do they do?
- Diagnose cancers (e.g., prostate cancer, kidney cancer, bladder cancer)
- Identify non-cancerous conditions (like infections, inflammation, or benign tumors)
- Work closely with urologists and oncologists to guide treatment decisions
I
Immunohistochemistry (IHC)
Immunohistochemistry (IHC) is a lab technique used to detect specific proteins in tissue samples using antibodies.
It’s a key method in diagnosing cancers, infections, and autoimmune diseases by revealing how cells behave under the microscope.
Example: The immunohistochemistry test confirmed the presence of cancer markers in the tissue sample.
Synonyms/Related terms
- Immunohistology (synonym)
- Antibody staining (informal synonym)
- Biomarker detection
- Tissue analysis
Break it down:
- Immuno- = immune system (antibodies)
- Histo- = tissue
- Chemistry = reactions
So: Immunohistochemistry = using antibodies in chemistry to study tissues!
For more details, refer to https://www.biomol.com/resources/applications/immunohistochemistry/ and https://oncodaily.com/oncolibrary/immunohistochemistry
Incidence
The occurrence, rate, or frequency of a disease, crime, or other undesirable thing.
Example: An increased incidence of cancer.
Ingest
To absorb (information) or take (food, drink, or another substance) into the body by swallowing or absorbing it.
Example: He spent his days ingesting the contents of the library.
Similar words: absorb, consume, eat
Inter-reader vs. Intra-reader variability
| Term | Definition |
|---|---|
| Inter-reader variability | Differences between different pathologists (pathologist A vs. B) interpreting the same slide. Used to assess consistency across experts. |
| Intra-reader variability | Differences within the same pathologist (pathologist A vs. A again) interpreting the same slide at different times. Used to test how reliable an individual expert is over time. |

Why it matters:
- High inter-reader variability can point to ambiguity in diagnostic criteria or training gaps.
- High intra-reader variability may raise concerns about fatigue, bias, or complexity of the case.
- Both are critical when validating:
  - AI pathology models
  - Clinical guidelines
  - Consensus diagnoses
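One simple way to put a number on either kind of variability is the fraction of slides on which two reads agree. The sketch below assumes that measure; the function name `percent_agreement` and the slide labels are hypothetical illustration data, not results from any study.

```python
def percent_agreement(reads_a, reads_b):
    """Fraction of slides on which two sets of reads give the same label."""
    if len(reads_a) != len(reads_b) or not reads_a:
        raise ValueError("both reads must cover the same, non-empty set of slides")
    matches = sum(a == b for a, b in zip(reads_a, reads_b))
    return matches / len(reads_a)

# Hypothetical labels for the same four slides.
pathologist_a_first = ["benign", "cancer", "cancer", "benign"]
pathologist_a_second = ["benign", "cancer", "benign", "benign"]  # A again, later
pathologist_b = ["cancer", "cancer", "cancer", "cancer"]

# Intra-reader: same pathologist, two points in time.
intra = percent_agreement(pathologist_a_first, pathologist_a_second)
# Inter-reader: two different pathologists.
inter = percent_agreement(pathologist_a_first, pathologist_b)
print(intra, inter)  # 0.75 0.5
```

Lower agreement means higher variability. Raw agreement does not correct for matches that happen by chance, so it is only a first approximation.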
Intractable
Hard to control or deal with.
Example: The optimization problem might be intractable to solve.
Similar words: unmanageable, ungovernable, out of control
L
Lilliputian
A trivial or very small person or thing.
Example: Lilliput is the name of a fictional island whose people, the Lilliputians, stand only about six inches high.
M
Metastasis/ Metastasise
To metastasise means for cancer cells to spread from the original (primary) site to other parts of the body through the blood or lymphatic system. This process forms secondary tumors, often in vital organs like the liver, lungs, or bones.
Example:
- The oncologist explained that the tumor had begun to metastasise to the lungs.
- The tumour will metastasise if not treated early.
“meta” = beyond and “stasis” = standing/place 💡 So: Metastasise = cancer goes “beyond its place.”
Mnemonic
A mnemonic is a memory aid — a tool, trick, rhyme, acronym, or phrase that helps you remember information more easily.
They are super useful for memorizing complex topics, vocabulary, medical terms, or lists.
Mnemonic Tip to Remember “Mnemonic”:
- Even though it starts with an “m”, it’s silent—just like memory hides the m in “mnemonic!” Think: Mnemonic = Memory aid!
Morbidity
The condition of suffering from a disease or medical condition.
Example: The therapy can substantially reduce respiratory morbidity in infants.
O
Oncology
Oncology is the branch of medicine that deals with cancer—its diagnosis, treatment, and research.
Oncologists
Doctors who specialize in the field of oncology are called oncologists. An oncologist treats cancer patients.
Example: She decided to specialize in oncology to help patients fighting cancer.
P
Pangram
A sentence containing every letter of the alphabet.
Example: “The quick brown fox jumps over the lazy dog.” is a pangram.
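The definition can be checked mechanically. This is a minimal sketch that assumes the 26-letter English alphabet; the function name `is_pangram` is an illustrative choice.

```python
import string

def is_pangram(sentence: str) -> bool:
    """True if every letter a-z appears at least once, ignoring case."""
    return set(string.ascii_lowercase) <= set(sentence.lower())

print(is_pangram("The quick brown fox jumps over the lazy dog."))  # True
print(is_pangram("Hello world"))                                   # False
```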
Pathology
Pathology is the study of diseases—their causes, effects, and development. It can also refer to the conditions caused by a disease.
Example:
- She studied pathology to understand how cancer spreads in the body.
- The report showed severe pathology in the liver tissue.
Synonyms:
- Disease study
- Medical science (when related to disease)
- Morbidity (when referring to the condition itself)
Pathologist
A doctor who studies disease by analyzing samples such as:
- Biopsies
- Surgical specimens
- Cytology (cells from urine, for example)
Prompt Engineering
Engineering a prompt so that an LLM does what we want.
By better crafting our prompts, we can improve the quality of results.
Prostate
The prostate is a small gland in males, located just below the bladder and in front of the rectum. It produces seminal fluid, which helps nourish and transport sperm during ejaculation.
It’s a part of the male reproductive system and is often discussed in relation to prostate cancer or benign prostatic hyperplasia (BPH).
Example: The doctor recommended a prostate exam because of the patient’s age and symptoms.
Related Terms:
- Prostate gland (full name)
- Prostate cancer (a common male cancer)
- Prostatitis (inflammation of the prostate)
Pseudo-Labels
Automatically generated labels from the data itself. These labels are not manually annotated but are inferred based on the inherent structure or attributes of the data.
In other words, the model automatically generates the labelled data.
Examples in practice:
- NLP: BERT uses MLM (Masked Language Modeling), GPT predicts the next word in a sequence (in CLM)
- Vision: SimCLR, MAE
- Speech: Wav2Vec
Example of a Masking/Prediction Task: In NLP, when training a model like BERT, random words in a sentence are masked. The model is trained to predict these masked words using the context (MLM).
- Input: “The cat is ___ the table.”
- Pseudo-label: “on.”
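The masking example can be sketched in a few lines to show that the label comes from the text itself rather than from an annotator. The function name `make_masked_example` and the `___` placeholder are illustrative assumptions; real BERT preprocessing works on subword tokens with a `[MASK]` token and chooses positions at random.

```python
def make_masked_example(sentence: str, mask_index: int, mask_token: str = "___"):
    """Return (masked_input, pseudo_label) for one word position."""
    words = sentence.split()
    pseudo_label = words[mask_index]  # the hidden word becomes the training target
    masked = words[:mask_index] + [mask_token] + words[mask_index + 1:]
    return " ".join(masked), pseudo_label

masked_input, pseudo_label = make_masked_example("The cat is on the table.", 3)
print(masked_input)  # The cat is ___ the table.
print(pseudo_label)  # on
```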
Pydantic
Pydantic is the most widely used data validation library for Python.
For more details, refer to the official website.
R
Rectified Linear Unit (ReLU)
A popular activation function that outputs the input directly if it’s positive, otherwise it outputs zero.
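A minimal sketch of the definition for a single number; deep learning libraries apply the same rule element-wise to whole tensors.

```python
def relu(x: float) -> float:
    """Return x when it is positive, otherwise zero."""
    return max(0.0, x)

print([relu(v) for v in (-2.0, 0.0, 3.5)])  # [0.0, 0.0, 3.5]
```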
Recurrent Neural Network (RNN)
A type of neural network designed for sequential data processing, where connections between nodes form a directed graph along a temporal sequence.
Regularization and Regularization Rate (λ)
One approach to keeping a model simple is to penalize complex models; that is, to force the model to become simpler during training. Penalizing complex models is one form of regularization.
The training optimization algorithm minimizes a combination of loss and model complexity:
$$\text{minimize}(\text{loss} + \lambda \cdot \text{complexity})$$
A regularization rate (lambda) controls the strength of regularization, with higher values leading to simpler models and lower values increasing the risk of overfitting.
Regularization types:
- L1 regularization
- L2 regularization
A high regularization rate:
- Strengthens the influence of regularization, thereby reducing the chances of overfitting
- Tends to produce a histogram of model weights with a normal distribution and a mean weight of 0
A low regularization rate:
- Lowers the influence of regularization, thereby increasing the chances of overfitting
- Tends to produce a histogram of model weights with a flat distribution
For more details, refer to Google Developers.
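The effect of the regularization rate can be seen in a small sketch. It assumes L2 regularization, where complexity is measured as the sum of squared weights; the function names and the weight values are hypothetical.

```python
def l2_penalty(weights):
    """Model complexity under L2 regularization: the sum of squared weights."""
    return sum(w * w for w in weights)

def regularized_loss(data_loss, weights, reg_rate):
    """Training objective: data loss plus reg_rate (lambda) times the penalty."""
    return data_loss + reg_rate * l2_penalty(weights)

weights = [0.5, -1.0, 2.0]                   # hypothetical model weights
print(regularized_loss(1.0, weights, 0.0))   # 1.0   (no regularization)
print(regularized_loss(1.0, weights, 0.5))   # 3.625 (large weights now cost more)
```

With a higher rate, the optimizer can reduce the objective more by shrinking weights than by fitting the data, which is what pushes the model toward simplicity.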
RESTful API
An API that follows REST (Representational State Transfer), a software architectural style that defines a set of constraints for creating web services.
RoBERTa
RoBERTa (Robustly Optimized BERT Pretraining Approach) is an optimized version of BERT that removes the Next Sentence Prediction task and uses different training configurations.
S
Self-attention
Self-attention is the mechanism that lets each word in a sentence focus on every other word, assessing their relationships to form a context-aware representation.
In other words, the self-attention mechanism enables the model to weight the importance of each word in a sequence relative to all other words, thus allowing the model to capture long-range contextual relationships and dependencies.
Both encoders and decoders consist of many layers connected by self-attention mechanisms.
Example: In the phrase “It was a bright sunny day,” the word “bright” can attend to “sunny” and “day” due to self-attention, achieving a bi-directional context that enriches its meaning. This bi-directional influence is essential for accurate tasks like named entity recognition and extractive question answering.
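The weighting idea can be sketched with plain Python lists. This is a simplified illustration, not a full Transformer layer: it uses scaled dot-product attention with the input vectors serving directly as queries, keys, and values (real models first apply learned projections), and the 2-dimensional embeddings are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Each vector attends to every vector (itself included) in the sequence."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        # Similarity of this position to every position, scaled by sqrt(dim).
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        # Context-aware representation: weighted average of all vectors.
        context = [sum(w * v[i] for w, v in zip(weights, vectors))
                   for i in range(dim)]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights

embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # toy "word" vectors
outputs, weights = self_attention(embeddings)
print(weights[0])  # the first word weights the similar second word above the third
```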
Self-supervised learning
A type of learning in which the objective is automatically computed from the inputs of the model. In other words, a type of machine learning where models learn to predict parts of data from other parts, without labels. That means humans are not needed to label the data.
Often used in natural language processing (NLP) and computer vision to pre-train models on large datasets.
Example: Transformer models like GPT, BERT, BART, T5, etc. have been trained as language models on large amounts of raw data in a self-supervised fashion. This type of model develops a statistical understanding of the language it has been trained on, but it’s not very useful for specific practical tasks. Because of this, the general pretrained model then goes through a process called transfer learning.
Semantic Parsing
Converting language into structured data, often for databases or code.
Example: T5 converts language into structured data (e.g., SQL queries).
Sentiment Analysis
Detecting the emotional tone of text.
Example: BERT-based Sentiment Classifier analyzes text for sentiment polarity.
Sequence-to-sequence transformer models
Transformer models that can handle input sequences and generate output sequences, typically used for tasks like translation, summarization, and text generation.
Examples: BART, T5
Sigmoid
An activation function that maps any real number to a value between 0 and 1, commonly used in binary classification problems.
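A minimal sketch of the function, 1 / (1 + e^(-x)), written in a form that avoids overflow for large negative inputs.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic function: squashes any real number into the range 0 to 1."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x is negative, so exp(x) cannot overflow
    return z / (1.0 + z)

print(sigmoid(0.0))  # 0.5
```

In binary classification, the output is commonly read as the probability of the positive class.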
Special Purpose LLMs
Large Language Models highly trained to focus on a single or small set of tasks. This is in contrast to General Purpose LLMs.
Example:
- Raven-13B: Tuned to provide function calling services
- Smaller & lower latency than general purpose LLMs
Spinal Metastasis
In patients with prostate cancer, spinal metastasis means that the cancer has spread from the prostate to the bones of the spine.
What does this mean for the patient?
1. Prostate cancer often spreads to bones, and the spine is one of the most common sites.
2. The spread happens through the bloodstream or lymphatic system.
3. It usually affects the vertebrae (bones of the spine), not the spinal cord itself (though the cord can be compressed).
Why is this serious?
- Indicates advanced (stage 4) cancer
- May require radiation therapy, hormone therapy, surgery, or pain management
- Early detection is important to prevent spinal cord compression, which can cause permanent nerve damage
Summarization
Producing concise summaries of longer content.
Example: BART generates concise summaries for long texts.
Succinct
Briefly and clearly expressed.
Example: Use short, succinct sentences.
Similar words: concise, short, brief, compact
Pronunciation: suhk·singkt
T
Text Classification
Categorizing text into predefined labels.
Text Completion
Generating the continuation of a given text.
Example: GPT-3 completes text based on context.
Transfer learning
A technique where a model developed for one task is reused as the starting point for a model on a second, related task. In other words, a process in which a (general) pretrained model is fine-tuned in a supervised way - that is, using human-annotated labels - on a given task.
The principle of transfer learning is to leverage knowledge from one domain (source task) to improve learning in another (target task).
Example: Often used in NLP and computer vision, where pre-trained models (e.g., BERT, ResNet) are fine-tuned for new tasks.
Transformer Model
A deep neural network (DNN) architecture primarily used for natural language processing (NLP) tasks. Transformer models rely on attention mechanisms to process data, enabling parallelization and improved performance on sequential data.
Example: Widely used in NLP tasks like translation, text generation, and question-answering; includes models like BERT and GPT.
Trivial
Of little value or importance.
Example: Huge fines were imposed for trivial offences.
Similar words: unimportant, insignificant, minor
Opposite words: non-trivial
U
Uni-directional attention
Decoder models operate in a uni-directional manner, meaning they only consider the context from the left (previous words) when making predictions. They do not access future words, in contrast to the bidirectional attention used in encoder models.
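The left-only restriction is usually described as a causal mask. The sketch below assumes a square matrix of raw attention scores (row i holds the scores of position i against every position) and normalizes each row over the current and earlier positions only; the function name is an illustrative choice.

```python
import math

def causal_attention_weights(scores):
    """Row i is normalized over positions 0..i; later positions get weight 0."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]  # current and previous positions only
        m = max(visible)
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        hidden = [0.0] * (len(row) - i - 1)  # future positions are masked out
        weights.append([e / total for e in exps] + hidden)
    return weights

scores = [[0.0, 0.0, 0.0] for _ in range(3)]  # equal raw scores everywhere
weights = causal_attention_weights(scores)
print(weights[0])  # [1.0, 0.0, 0.0]
print(weights[1])  # [0.5, 0.5, 0.0]
```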
UK Conformity Assessed (UKCA Mark)
- This is the UK’s version of the CE mark, created after Brexit, that’s required for medical devices (including software).
- It applies to products sold in England, Scotland, and Wales.
A UKCA mark ensures the product meets UK-specific safety and regulatory requirements.
V
Vanishing gradient
A problem in training deep neural networks where gradients become exponentially small as they propagate back through the network layers, making it difficult for the network to learn long-range dependencies.
For more details, refer to Google Developers ML Course.
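The exponential shrinkage can be illustrated with a deliberately simplified chain: one sigmoid unit per layer, every weight assumed to be 1, and every pre-activation fixed at the same value. Under those assumptions, the gradient reaching the first layer is the product of the per-layer sigmoid derivatives, each of which is at most 0.25.

```python
import math

def sigmoid_derivative(x: float) -> float:
    """Derivative of the sigmoid, s * (1 - s); its maximum is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def gradient_scale(num_layers: int, pre_activation: float = 0.0) -> float:
    """Factor by which a gradient shrinks after passing back through the chain."""
    return sigmoid_derivative(pre_activation) ** num_layers

for depth in (1, 5, 20):
    print(depth, gradient_scale(depth))
```

Even in this best case for the sigmoid, 20 layers shrink the gradient by a factor below 1e-11, so the earliest layers barely update.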
W
Wintering
(Especially of a bird) spending the winter in a particular place.
Example: Birds wintering in the Channel Islands.