# AI == Compression
# [Mayday 2025](https://www.cbs8.com/article/news/local/ucsd-health-workers-set-to-walk-off-the-job-wednesday-morning/509-e5472039-79d9-4e2f-b9bd-beb775ebb8c2)
#### Hobson Lane
#### UCSD RET

--v--

## [clueo.net](https://indica.clueo.net/)

#### Job Crowder and
#### Alicia Chen

## Engineers

- Taylor Kirk
- Jason O'Dell

--v--

## RET - Research Experience for Teachers

Teaching AI to San Diego teachers and students

### [Gary Cottrell](https://cseweb.ucsd.edu/~gary/)
### UCSD CS Dept with NSF support

---

## Agenda

1. **[Assignment](#student-assignment)**
2. [Get serious](#get-serious)
3. [Decision trees](#decision-trees)
4. [Hybrid code networks](#hybrid-code-networks)
5. [State-space models](#state-space-models)
6. [Python tricks](#python-tricks)

--v--

## Student assignment

1. Text adventures
1.1 **[Why](#why-text-adventure)**
1.2 **[Example](#plan)**
1.3 **[Natural language](#natural-language)**
1.4 **[Auto-grade](#auto-grade)**
1.5 **[REPL-grade](#repl-grade)**

--v--

## [Text adventures](#student-assignment)

#### Build a **Text Adventure (Chatbot)** from scratch

- keywords: `if`, `print()`, `input()`, `=`
- bonus keywords: `def`, `in`, `str.*()`

--v--

## [Why text adventure?](#agenda)

GenX got their start w/ text adventures

- _DnD_
  - Character sheets
  - Die rollers
  - Dungeon Masters
- _Oregon Trail_
- _Colossal Cave Adventure_
- _Rogue_

--v--

## [Plan](#agenda)

#### Student design
#### [Game plan == dialog plan]()

--v--

## Example

```python
print('You are a student at Mesa and you want to go swimming!')
resp = input("You're in SD. Which way do you go (N/S)? ")
if resp[:1] == 'N':  # resp[:1] is '' on empty input, so no IndexError
    resp = input("You're in LA. Which way now (E/W)? ")
    if resp[:1] == 'E':
        print("You're in Vegas. You lose ;(")
    elif resp[:1] == 'W':
        print("You're at the beach! You win! :-)")
    else:
        print('Invalid direction in LA. You lose!')
elif resp[:1] == 'S':
    resp = input("You're in TJ. Which way now (E/W)? ")
    if resp[:1] == 'E':
        print("You're in the desert. You lose ;(")
    elif resp[:1] == 'W':
        print("You're at the coast! You win! :-)")
    else:
        print('Invalid direction in TJ. You lose!')
else:
    print('Invalid direction in SD. You lose!')
print('Game Over')
```

--v--

## Natural language
**More natural UX**
`.lower()` `.strip()`
`.replace(' ', '')` `.startswith(answer)`
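`answer in message` **[Keyword search]()**

--v--

## Auto-grade

- **Pylint** code quality score
- Open loop integration test

```bash
$ python -c 'print("N"); print("W")' | python game.py
$ cat player_input.txt | python game.py
```

--v--

## Close the loop with `pexpect`

```python
import pexpect

def select_action(text):
    # Placeholder policy (an assumption): always answer 'north'
    return 'north'

child = pexpect.spawn('python game.py', encoding='utf-8')
while child.isalive():
    try:
        child.expect('.*: ')  # assumes each prompt ends with ': '
    except (pexpect.EOF, pexpect.TIMEOUT):
        break  # the game exited or stopped prompting
    child.sendline(select_action(child.after))  # child.after is the matched prompt
```

--v--

## REPL-Grade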
### **R**
### **E**
### **P**
### **L**

--v--

## Thoughtful feedback

### **R**ead
### **E**
### **P**
### **L**

--v--

## Thoughtful feedback

### **R**ead
### **E**val
### **P**
### **L**

--v--

## Thoughtful feedback

### **R**ead
### **E**val
### **P**rint
### **L**

--v--

## Thoughtful feedback

### **R**ead
### **E**val
### **P**rint
### **L**oop

--v--

## Mindful grading

### **R**ead
### **E**val
### **P**lay
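
--v--

## Natural-language matching: a sketch

The string methods listed on the "Natural language" slide can be combined into one small helper. This is a minimal sketch, not part of the original assignment: the names `normalize` and `match_direction` and the four-direction vocabulary are assumptions made for illustration.

```python
def normalize(message):
    """Lowercase, trim, and drop spaces: '  Go North ' -> 'gonorth'."""
    return message.lower().strip().replace(' ', '')


def match_direction(message, directions=('north', 'south', 'east', 'west')):
    """Return the first direction the player seems to mean, or None."""
    text = normalize(message)
    if not text:
        return None  # empty reply: nothing to match
    for answer in directions:
        # Abbreviation: 'n' or 'nor' is the start of 'north'
        if answer.startswith(text):
            return answer
        # Keyword search: 'north' appears somewhere in the reply
        if answer in text:
            return answer
    return None


print(match_direction(' N '))             # abbreviation
print(match_direction("Let's go WEST!"))  # keyword inside a sentence
print(match_direction('up'))              # no match
```

Because spaces are removed before the keyword search, a reply such as "the beast" would match `east`. Searching whole words instead would avoid that, at the cost of a few more lines.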

---

## Agenda

1. [Assignment](#student-assignment)
2. **[Get serious](#get-serious)**
3. [Decision trees](#decision-trees)
4. [Hybrid code networks](#hybrid-code-networks)
5. [State-space models](#state-space-models)
6. [Python tricks](#python-tricks)

--v--

## Getting serious

2. Business applications
2.1 **[Serious examples](#examples)**
2.2 **[Vibin examples](#vibin)**
2.3 **[Serious vibin](#serious-vibin)**
2.4 **[Serious grounding](#serious-grounding)**

--v--

## Examples

- Phone tree
- Autocomplete
- Therapy
  - e.g. ELIZA, Woebot
- Chat ops
- Search

--v--

## Vibin

Full monty naked LLMs

- _Grok_
  - famously guardrail free
- _ChatGPT_
  - relaxing guardrails
- _DeepSeek_
  - exposed customer data in a public database within days of launch [7](#links)

--v--

## Serious vibin

- RAG & LAS
  - LLM augmented search
- CLI assistant (`shy-sh`)
- Vibe assistant
  - _Copilot_
  - _Cursor_
  - _Aider_
- Homework copypasta
  - _Claude Code_
  - _Gemma_

--v--

## Generalization & abstraction

- Lossy concept compression
- Concept representation matters
- Stereotypes
- Biases

--v--

## JPG compression is not AI

#### Transformer compression is not AI

--v--

## Serious grounding

Grounding with NLP & Data Science

- Classification
- Class-specific templates
- Entity extraction
- Slot filling
- Templates with slots filled

--v--

## Grounding Approaches

- RLHF (Reinforcement Learning from Human Feedback)
- Fine tuning (LoRA)
  - forgets common sense
- More context

#### Better approaches

- Less context + small models
- Hybrid

--v--

## More context

- RAG
- Few shot interpolation
  - Examples of what you want
  - Examples of what you don't want

--v--

## Therapist training

--v--

## Artificial Intelligence?

There's only one problem with AI...

--v--

## Artificial Intelligence?

There's only one problem with AI...

It's trained to not be dumb:

- Only knows about tokens

NOT:

- Numbers
- Math
- Logic
- Counting
- Physical objects
- Common sense logic
- Geometry
- Rules

--v--

## Why?

- Incorrect generalizations
- Representation problem

--v--

## Generalization

Generalization == lossy compression

- False negatives (metadata, forensics)
- False positives (watermarks)

--v--

## Good compression

Mayfield, KY candle factory - before and after a tornado

--v--

## Overgeneralization

- Ignores critical details
- Retains irrelevant noise
- Retains incorrect generalizations (biases)

--v--

## LLM reasoning

- Guesses wildly
- "out-of-distribution" sampling
- Fuzzifies user intent
- Adds ambiguity
- Fuzzy database (semantic search)

---

## Agenda

1. [Assignment](#student-assignment)
2. [Get serious](#get-serious)
3. **[Decision trees](#decision-trees)**
4. [Hybrid code networks](#hybrid-code-networks)
5. [State-space models](#state-space-models)
6. [Python tricks](#python-tricks)

--v--

## 3. Decision trees

3. Critical decisions
3.1 **[Attention to NL](#attention-to-NL)**
3.2 **[Vibin examples](#vibin)**
3.3 **[Decision tree](#decision-tree)**
3.4 **[Bayes net](#bayes-net)**

--v--

## Attend to NL?

--v--

## Wrong NL attention

--v--

## Decision tree

FDA classification of milk nutrition

--v--

## Bayes net

#### Expert system for diagnosing cancer

--v--

## Learned decision trees

- Random forest
- Neuromorphic programming (deep learning)
- Decision root ball
- Bayesian belief networks, _The Book of Why_

---

## Agenda

1. [Assignment](#student-assignment)
2. [Get serious](#get-serious)
3. [Decision trees](#decision-trees)
4. **[Hybrid approach](#hybrid-code-networks)**
5. [State-space models](#state-space-models)
6. [Python tricks](#python-tricks)

--v--

## 4. Hybrid approach

4. Decision logic + deep learning
4.1 **[Hybrid code networks](#hybrid-code-networks)**
4.2 **[Bamba](#bamba)**
4.3 **[State space model](#state-space-model)**

--v--

## Hybrid Code Networks

--v--

## Bamba

### Bamba-v2-9B
[8](#links)

- State Space Model (SSM) & Transformer layers
- IBM, Princeton, CMU, UIUC
- Fully open source (data, code, weights)
- Based on Mamba2
- Subquadratic scaling

### VS Llama-3.1-8B

- 2.5x faster (latency & throughput)
- 5x less data (3T vs 15T tokens)

--v--

## State-space Model

- Mimic the "time cells" in the hippocampus
- [Better representation (AIMA)](https://csd.cmu.edu/course/15281/s24)
- Predict sequences
- Feedback control systems
- Scale nearly linearly with sequence length

--v--

## Control system
### Discrete SSM
[9](#links)

### `x[t+1] = A * x[t] + B * u[t]`
### `y[t] = C * x[t] + D * u[t]`
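
The two update equations above can be stepped through in a few lines of plain Python. This is a minimal sketch under stated assumptions: the helper names (`matvec`, `vadd`, `simulate`) and the one-state "leaky accumulator" matrices are hypothetical values chosen for illustration, not taken from Bamba or any other model in this deck.

```python
def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vadd(a, b):
    """Add two vectors element by element."""
    return [x + y for x, y in zip(a, b)]


def simulate(A, B, C, D, x0, inputs):
    """Iterate x[t+1] = A x[t] + B u[t] and y[t] = C x[t] + D u[t]."""
    x, outputs = list(x0), []
    for u in inputs:
        outputs.append(vadd(matvec(C, x), matvec(D, u)))  # y[t]
        x = vadd(matvec(A, x), matvec(B, u))              # x[t+1]
    return outputs, x


# Hypothetical 1-state system: the state decays by half each step
# and accumulates the input.
A, B, C, D = [[0.5]], [[1.0]], [[1.0]], [[0.0]]
inputs = [[1.0], [1.0], [1.0]]
outputs, final_state = simulate(A, B, C, D, [0.0], inputs)
print(outputs)      # [[0.0], [1.0], [1.5]]
print(final_state)  # [1.75]
```

Each step touches the state once, so the cost of this loop grows linearly with the number of inputs.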

--v--

## NLPiA Hybrid Networks with Feedback

---

## Convomeld

- Merge conversation logs to create a dialog tree
- Plot the dialog plan network (graph)
- Load dialog plan to `networkx`
- Create drawio dialog plan diagrams w/ drawpy
- Execute the dialog plan

#### [gitlab.com/tangibleai/community/convomeld](https://gitlab.com/tangibleai/community/convomeld)

--v--

## [indica.clueo.net](https://indica.clueo.net/)

- Django admin wrapper
- Conversation logs
- Exercises
- LLM prompts

### TODO:

- API for ConvoMeld

--v--

## Reflect-a-bot

--v--

## ipython aliases

#### `~/.ipython/profile_default/startup/my_aliases.ipy`

```python
%alias meld meld
%alias subl subl
%alias which which
%alias wc wc
%alias find find
%alias curl curl
%alias grep grep
```

--v--

### Links

1. ["PAIR: ... Counselor Reflection Scoring in Motivational Interviewing", Min et al.](https://aclanthology.org/2022.emnlp-main.11.pdf)
2. ["Building a Motivational Interviewing Dataset", Pérez-Rosas et al.](https://aclanthology.org/W16-0305.pdf)
3. ["Explaining Bayesian Networks in Natural Language using Factor Arguments", Oct 2024, Jaime Sevilla et al.](https://www.researchgate.net/publication/385176693_Explaining_Bayesian_Networks_in_Natural_Language_using_Factor_Arguments_Evaluation_in_the_medical_domain/fulltext/6719c020edbc012ea138df93/Explaining-Bayesian-Networks-in-Natural-Language-using-Factor-Arguments-Evaluation-in-the-medical-domain.pdf)
4. [Decision-trees-for-the-Start-up-milk-classification](https://www.researchgate.net/profile/Daniel-Lefebvre-2/publication/238725014/figure/fig3/AS:669294299971611@1536583604771/Decision-trees-for-the-Start-up-milk-classification-task-with-a-low-A-medium-B.png)
5. [Indica Django App by Job Crowder & Jason](https://indica.clueo.net/)
6. [On the Biology of LLMs by Anthropic](https://transformer-circuits.pub/2025/attribution-graphs/biology.html)
7. [DeepSeek breach - customer data public ClickHouse DB](https://www.wiz.io/blog/wiz-research-uncovers-exposed-deepseek-database-leak)
8. [Bamba-v2-9B on huggingface](https://huggingface.co/blog/ibm-ai-platform/bamba-9b-v2)
9. [en.Wikipedia.org/wiki/State-space_representation](https://en.Wikipedia.org/wiki/State-space_representation)

---