Tutorial 05 — Working With Complex Documents: Datasheets and Manuals
AI and PicBasic Pro Tutorial Series — Post 5 of 6
Content produced with AI assistance and reviewed by the forum administrator.
How AI Reads a Document — Not the Way You Do
When you open a 400-page datasheet you navigate it the way an experienced engineer does — straight to the oscillator section when configuring the PLL, straight to the CONFIG register table when a bit is behaving unexpectedly. Reading it end to end would take a day and retain very little.
When you upload that same document to an AI workspace, the AI does not read it sequentially. It has access to the full text and can locate and cross-reference specific sections in response to a query, without fatigue and without the sequential constraint that makes large technical documents laborious for humans.
This means the argument for simplifying a document before giving it to the AI — to make it more digestible — does not apply. The AI does not need it simplified. What it does need is for the right content to be present and findable.
The Real Constraint — Context Window Size
Every AI conversation has a context window — a limit on how much information can be active at once. When the AI is working on your code, that window contains your code, the conversation history, the workspace instruction, and content retrieved from your documents. Everything competes for the same space.
A complete microcontroller datasheet converted to text can run to tens of thousands of lines. When the AI retrieves content from it, it pulls chunks into the active context. If those chunks contain electrical characteristics tables, package diagrams, and programming specifications that have nothing to do with writing PBP3 code, that space is wasted.
A targeted extract covering only the sections relevant to PBP3 development uses that context space efficiently. Every line it occupies is earning its place.
Recommended approach: keep the full datasheet in the workspace for edge cases, and add a targeted extract as the primary reference the AI reaches for first.
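To get a feel for the size difference, a rough estimate can be sketched in Python. This is only an illustration: the 4-characters-per-token figure is a common rule of thumb rather than a property of any particular model, the window size is a placeholder, and the two stand-in strings are invented sizes, not measurements of a real datasheet. Retrieval pulls chunks rather than whole files, so treat the output as a relative comparison only.

```python
# Rough context-budget comparison between a full datasheet and a targeted extract.
# ASSUMPTIONS: ~4 characters per token is a rule of thumb, and the window size
# below is a placeholder. Check the documentation for the AI tool you use.

CHARS_PER_TOKEN = 4               # assumed heuristic
CONTEXT_WINDOW_TOKENS = 200_000   # hypothetical window size


def estimate_tokens(text: str) -> int:
    """Return a rough token count for text using the chars-per-token heuristic."""
    return len(text) // CHARS_PER_TOKEN


def window_share(text: str, window_tokens: int = CONTEXT_WINDOW_TOKENS) -> float:
    """Return the fraction of the context window the text would occupy."""
    return estimate_tokens(text) / window_tokens


if __name__ == "__main__":
    # Stand-in strings with invented sizes; in practice read your own files.
    samples = {
        "full datasheet": "x" * 2_000_000,
        "targeted extract": "x" * 120_000,
    }
    for name, text in samples.items():
        print(f"{name}: ~{estimate_tokens(text)} tokens, "
              f"{window_share(text):.0%} of the assumed window")
```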
What to Extract From a PIC Datasheet
The following sections cover the majority of questions that arise in PBP3 development. They apply to most PIC devices — the specific register names will vary but the categories are consistent:
- Configuration registers — CONFIG word bit definitions, all words. Every project starts here and CONFIG errors are the most common AI-generated fault category.
- Oscillator section — OSCCON register, any software PLL control bits, the relationship between configuration bits and runtime oscillator control. Timing errors trace back here.
- Weak pull-up control — OPTION_REG or equivalent, individual port pull-up registers. Floating input faults trace back here.
- Timer registers — whichever timers your project uses, including the period calculation formula. Essential for ISR timing verification.
- Interrupt control registers — INTCON, PIE, PIR for relevant peripherals. The AI needs these to verify ISR enable sequences and flag-clear logic.
- Port registers — TRIS, PORT, LAT, ANSEL for each port. Pin direction, analog/digital selection, output latches.
- Any shared-pin peripherals — LCD controller pins, CCP pins, PPS if the device supports it. These interactions cause silent failures that are hard to trace without the datasheet.
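As an example of why the timer period formula belongs in the extract, here is a minimal Python sketch of the check it enables. It assumes a classic 8-bit Timer2 clocked from Fosc/4 with a prescaler, the PR2 period register, and a postscaler. Newer devices with selectable Timer2 clock sources may differ, so confirm the formula against your own datasheet before relying on it. The function name is made up for this illustration.

```python
def timer2_period_seconds(fosc_hz: float, pr2: int,
                          prescale: int, postscale: int = 1) -> float:
    """Return the Timer2 interrupt period in seconds.

    ASSUMPTION: classic Timer2 clocked from Fosc/4, so
    period = 4 * Tosc * prescale * (PR2 + 1) * postscale.
    Verify this against the timer section of your device's datasheet.
    """
    if not 0 <= pr2 <= 255:
        raise ValueError("PR2 is an 8-bit register: expected 0..255")
    if fosc_hz <= 0 or prescale < 1 or postscale < 1:
        raise ValueError("fosc_hz, prescale and postscale must be positive")
    return 4.0 * prescale * (pr2 + 1) * postscale / fosc_hz


if __name__ == "__main__":
    # Example values chosen for illustration: 16 MHz clock, 1:16 prescaler, PR2 = 249.
    period = timer2_period_seconds(16_000_000, pr2=249, prescale=16)
    print(f"Timer2 interrupt period: {period * 1000:.3f} ms")
```

If the AI proposes PR2 and prescaler values for an ISR, running them through a check like this is a quick way to confirm the timing it claims.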
Asking the AI to Do the Extraction
You do not need to do this manually. The AI can extract the relevant sections from a datasheet and produce a clean structured text document ready to upload to your workspace. Do this in a standard conversation outside your project — you want a clean context for this task.
Upload the datasheet PDF and send a prompt similar to this:
Code:
; ============================================================
; DATASHEET EXTRACTION PROMPT — TEMPLATE
; Replace [YOUR DEVICE] with your device part number
; Adjust the section list to match your device's register names
; ============================================================
I am setting up an AI workspace for PicBasic Pro 3 development
targeting [YOUR DEVICE].
Extract the following sections from the attached datasheet and
produce a single clean structured text document I can upload to
my workspace as a reference file.
Extract these sections completely, preserving all register bit
definitions, bit names, valid values, and any notes or warnings:
1. All Configuration Word register definitions (all CONFIG words)
2. OSCCON register and any software PLL/oscillator control bits
3. Oscillator modes section — clock sources, PLL operation
4. OPTION_REG or equivalent — complete bit definitions
5. Weak pull-up registers for all ports
6. Timer registers used in my projects: T2CON, PR2, etc.
7. Interrupt control registers — INTCON, PIE1, PIR1
8. TRIS, PORT, LAT, ANSEL registers for all ports
9. Any peripheral registers that share pins with general I/O
Format the output as:
- A clear heading for each register or section
- Register address where available
- Each bit listed with: number, name, description, valid values
- Notes and cross-references from the datasheet preserved
- No summaries or editorial comment — extracted content only
Output the complete document inside:
===REGISTER REFERENCE START===
(content here)
===REGISTER REFERENCE END===
NOTE: This is a template prompt. Before using it, replace [YOUR DEVICE] with your actual device part number and adjust the section list to match the register names in your specific datasheet. Verify the extracted content against the datasheet before uploading it to your workspace.
Copy the content between the markers, save it as a plain text file with a clear name such as YourDevice_Register_Reference.txt, and upload it to your workspace.
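If you would rather not copy by hand, the step above can be sketched in Python. This assumes you have saved the AI's full reply as text and that it used the two markers exactly as written in the prompt. The function names and the file name in the usage lines are hypothetical.

```python
from pathlib import Path

START_MARKER = "===REGISTER REFERENCE START==="
END_MARKER = "===REGISTER REFERENCE END==="


def extract_reference(reply: str) -> str:
    """Return the text between the two markers, trimmed, with one trailing newline."""
    start = reply.find(START_MARKER)
    if start == -1:
        raise ValueError("start marker not found in reply")
    start += len(START_MARKER)
    end = reply.find(END_MARKER, start)
    if end == -1:
        # A missing end marker usually means the reply was cut off.
        raise ValueError("end marker not found; the reply may be truncated")
    return reply[start:end].strip() + "\n"


def save_reference(reply: str, path: str) -> None:
    """Write the extracted reference to a plain text file."""
    Path(path).write_text(extract_reference(reply), encoding="utf-8")


if __name__ == "__main__":
    sample_reply = (
        "Here is the document.\n"
        f"{START_MARKER}\nOSCCON: Oscillator Control Register\n{END_MARKER}\n"
    )
    print(extract_reference(sample_reply), end="")
    # save_reference(sample_reply, "YourDevice_Register_Reference.txt")
```

A missing end marker is treated as an error on purpose: a long extraction can be cut off partway, and a truncated reference file is worse than none.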
Verifying the Extract
Before uploading, spot-check the extracted document against the datasheet on two or three registers you know well. Look for:
- Bit names that match exactly — not approximations of the name from the datasheet
- Bit polarity described correctly — some enable bits are active low, some active high, and the AI can get this wrong
- Any interactions between registers noted — for example, a global enable bit that must be cleared before individual bits take effect
- All CONFIG words present and complete
If anything is missing or wrong, add a follow-up in the same conversation asking the AI to re-extract that specific section. Paste the corrected section into your file before uploading.
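The "all present and complete" part of this check can be partly automated. The Python sketch below only tests whether each expected register name appears somewhere in the extract as a whole word. It cannot tell you whether bit names or polarities are correct, so the manual spot-check is still needed. The list of names in the usage lines is a hypothetical example; build yours from your own datasheet.

```python
import re
from typing import List


def missing_names(extract_text: str, expected: List[str]) -> List[str]:
    """Return the expected register names that never appear as whole words.

    Matching is case-insensitive. Presence only: this does not validate
    bit definitions or polarity.
    """
    upper = extract_text.upper()
    return [
        name for name in expected
        if not re.search(rf"\b{re.escape(name.upper())}\b", upper)
    ]


if __name__ == "__main__":
    # Hypothetical extract text and register list, for illustration only.
    extract = "OSCCON: Oscillator Control\nT2CON: Timer2 Control\nINTCON: Interrupt Control\n"
    wanted = ["OSCCON", "T2CON", "PR2", "INTCON", "OPTION_REG"]
    print("Missing from extract:", missing_names(extract, wanted))
```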
The PBP3 Manual
The PBP3 manual is equally important and requires no extraction — upload the full manual PDF directly. It is not as large as a datasheet and its content is uniformly relevant to PBP3 development. The AI benefits from having the complete manual rather than a subset.
The manual is available from the melabs website at https://www.melabs.com.
The Complete Document Stack
A well-equipped workspace contains:
- Workspace instruction — tells the AI to use documents as authority, not memory
- PBP3 manual PDF — complete language and compiler reference
- Device register reference TXT — targeted extract, primary reference for register queries
- Full device datasheet PDF — complete reference for edge cases
- Working code samples — style baseline for the target device
Each document earns its place. Together they give the AI a working reference for PBP3 development on your target device that is grounded in authoritative sources.
Previous: Tutorial 04 — A Structured Three-Prompt Debugging Workflow
Next: Tutorial 06 — Putting It All Together