SPARQL tool – connecting LLMs to knowledge graphs


Talk to a knowledge graph – sparql-tool

Problems with using LLMs: hallucinations (sometimes good for creativity), outdated knowledge, no access to your data (trained on public knowledge).

Solutions:

Fine-tuning – retrain the model on domain data

RAG (retrieval-augmented generation) –

Search: look things up in real time. Claude Code reads the documents it finds.

Vector embeddings: semantic similarity search over documents.

Knowledge graphs: structured, queryable, machine-readable knowledge

Two main flavors:

Property graphs: nodes and edges carry key-value properties – Neo4j, TigerGraph

RDF graphs: everything is a triple – subject, predicate, object

Engines

Datasets:

Everything is a URI

So, for example, for age: do not store the exact age. Store the date of birth and compute the age from it, so the data does not go stale.
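A minimal sketch of the triple idea and the date-of-birth advice, using plain Python tuples rather than a real triple store. The `http://example.org/` URIs, the person, and the helper names `objects` and `age_on` are all hypothetical and only for illustration.

```python
from datetime import date

# Hypothetical URIs, for illustration only.
EX = "http://example.org/"
ALICE = EX + "person/alice"
NAME = EX + "vocab/name"
BIRTH_DATE = EX + "vocab/birthDate"

# Each fact is one (subject, predicate, object) triple.
triples = [
    (ALICE, NAME, "Alice"),
    (ALICE, BIRTH_DATE, date(1990, 6, 15)),
]


def objects(subject, predicate):
    """Return every object stored for a subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def age_on(birth, today):
    """Whole years elapsed between birth and today."""
    had_birthday = (today.month, today.day) >= (birth.month, birth.day)
    return today.year - birth.year - (0 if had_birthday else 1)


# The age is derived at query time, never stored as a fact.
birth = objects(ALICE, BIRTH_DATE)[0]
print(age_on(birth, date.today()))
```

Storing the birth date keeps the graph correct over time; a stored age would be wrong after the next birthday.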

The RDF ecosystem:

RDF

OWL (ontology language)

Triple stores – databases for RDF

Ontologies (formal schemas)

SPARQL – query language for RDF

Problems with RDF

Low traction outside academia

Steep learning curve

SPARQL is tedious to write by hand

Ontologies are complex

LLMs (Claude etc.) have this formal knowledge baked in

They understand RDF, OWL, SPARQL, ontologies

The question is how to use them effectively.

sparql-tool

Democratizes RDF

Three components: skill, agent, CLI tool

The CLI does not need an LLM

This is replacing MCP.

Web search sometimes does not work, because meaning is lost between the question asked and what is actually searched for

By the way, people are complaining that users no longer go to websites; they go to LLMs.

DBpedia KG – they extract Wikipedia very often…
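A rough sketch of querying DBpedia from plain Python, with no LLM involved. This is not how sparql-tool itself works; it only shows what a hand-written SPARQL request looks like. It assumes the public endpoint at `https://dbpedia.org/sparql` and that the `dbo:starring` property and `dbr:Kevin_Bacon` resource fit the question; `build_url` and `run_query` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumed public DBpedia SPARQL endpoint.
ENDPOINT = "https://dbpedia.org/sparql"

# Assumed vocabulary: films whose dbo:starring points at Kevin Bacon.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT ?film WHERE {
  ?film dbo:starring dbr:Kevin_Bacon .
}
LIMIT 10
"""


def build_url(endpoint, query):
    """Encode the query as a GET request with a 'query' parameter."""
    params = urllib.parse.urlencode({"query": query})
    return f"{endpoint}?{params}"


def run_query(endpoint, query, timeout=30):
    """Send the query and return the list of JSON result bindings."""
    request = urllib.request.Request(
        build_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]
```

Calling `run_query(ENDPOINT, QUERY)` needs network access; each binding should carry the film URI under the `film` key.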

One way to test things: try the Kevin Bacon algorithm and show the hops.
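The hop-counting idea can be sketched as a breadth-first search over a co-starring graph. The graph below is made-up placeholder data, not DBpedia output, and `hops` is a hypothetical helper name.

```python
from collections import deque

# Hypothetical co-starring graph: actor -> set of co-stars.
costars = {
    "actor_a": {"actor_b"},
    "actor_b": {"actor_a", "actor_c"},
    "actor_c": {"actor_b", "actor_d"},
    "actor_d": {"actor_c"},
    "actor_e": set(),
}


def hops(graph, start, goal):
    """Return the shortest path from start to goal, or None if unreachable."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        # Sorted only to make the traversal order deterministic.
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


path = hops(costars, "actor_a", "actor_d")
print(len(path) - 1, path)
```

Showing the full path, not just the hop count, makes it easy to check each link against the source data.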

Biotech: UniProt

UniProt KG (they publish data about proteins)

Which proteins are associated with Alzheimer's?

The better question is then which proteins interact with which other proteins. That query takes a long time.

Another dataset used: IntAct…

You have to know where the dataset came from.

Pokemon: knowledge graph.

A community graph with 100k entries. Used it as is.
