langchain gpt sql

Explore LangChain, GPT, and Database Integration

Introduction to LangChain

LangChain is a pioneering technology that acts as a conduit between natural language processing (NLP), GPT, and databases such as SQL, MySQL, and Postgres. It empowers users to engage with databases using natural language, making the task of retrieving, manipulating, and managing data simpler without the necessity for complex queries. Utilizing advanced NLP techniques, LangChain can comprehend user queries and convert them into structured queries, ensuring smooth communication with databases through GPT-powered natural language interfaces. This unique integration of LangChain, GPT, and database technologies presents a user-friendly approach to database interaction, bridging the linguistic gap between users and database queries.

Hire the best developers in Latin America. Get a free quote today!

Contact Us Today!

Why LangChain?

LangChain fills a crucial need in the realm of Large Language Models (LLMs). While LLMs excel at a wide range of tasks, they may fall short when it comes to providing specific answers or deep domain expertise. To address this limitation, LangChain adds data-awareness and agentic capabilities to LLMs, allowing applications to connect to external data sources and engage in interactive, context-aware interactions. This integration empowers developers to build more powerful and specialized language model applications that can provide targeted and nuanced responses, bridging the gap between LLMs and domain-specific knowledge requirements.

LangChain’s Database Integration

In a comprehensive blog post, we explored a range of LangChain use cases, one of which includes conversing with PDF documents. We emphasized how LangChain provides a robust framework for crafting GPT powered applications, inclusive of intelligent chatbots that deliver contextually aware responses based on external documents. This underscores the potential of integrating LangChain, GPT, and database technologies for a seamless user experience.

Bridging GPT and Databases

LangChain significantly improves our interaction with databases by harnessing the capabilities of Large Language Models (LLMs) like GPT. The traditional approach has been to securely store enterprise data in databases, necessitating a grasp of specific schemas for data access and manipulation. LangChain, however, innovates in this arena by introducing Database Chains and Agents, which facilitate the creation and execution of queries from natural language prompts. Regardless of whether you’re employing MySQL, PostgreSQL, Oracle SQL, Databricks, or SQLite, the adaptability of LangChain to any SQL dialect supported by SQLAlchemy demonstrates its versatility across a broad array of applications. This narrative hints at the expansive potential that the amalgamation of LangChain, GPT, and database technologies holds in revolutionizing data interaction paradigms.

How LangChain Works with Databases

LangChain can be integrated with databases to facilitate natural language interactions. Users can ask questions or make requests in plain English, and LangChain translates these into the corresponding queries. This interaction enables users, even those without technical expertise in SQL, to access and manage their database information efficiently.

Step 1: Installing Dependencies

Start by creating and activating a virtual environment. Execute the following commands:

python -m venv env
source env/bin/activate # for Ubuntu
env/Scripts/activate # for Windows

Once the virtual environment is activated, install the necessary packages:

pip install langchain langchain-experimental openai

Step 2: OpenAI API Key

LangChain is compatible with several LLMs, but using OpenAI models often yields better results.

To use LangChain with OpenAI models, you need an OpenAI key. Follow the steps below to create a new OpenAI key:

  1. Visit platform.openai.com.
  2. Click on your name or icon, located on the top right corner, and select “View API Keys” or follow this link.
  3. Click on the “Create new secret key” button to generate a new OpenAI key.
  4. We will use the API key in the next segment where we will write chat function.

Step 3: Creating a Python Script with LangChain + ChatGPT

Create a new Python file named langchain_sql_chat.py and add the following code to it:

import os
from langchain.utilities import SQLDatabase
from langchain.llms import OpenAI
from langchain_experimental.sql import SQLDatabaseChain

os.environ['OPENAI_API_KEY'] = 'OPENAI API KEY from Step 2'

db = SQLDatabase.from_uri("mssql+pyodb://username:password@hostname/database_name")

llm = OpenAI(temperature=0.2, verbose=True)
db_chain = SQLDatabaseChain.from_llm(llm, db, verbose=True)

def chat_with_sql():
    print("Type 'exit' to quit")

    while True:
        prompt = input("Enter a prompt: ")

        if prompt.lower() == 'exit':
            print('Exiting...')
            break
        else:
            try:

                print(db_chain.run(prompt))
            except Exception as e:
                print(e)

chat_with_sql()

Understanding The Code

By default, LangChain creates the OpenAI model with a temperature value of 0.7. The temperature parameter adjusts the randomness of the output. Higher values like 0.7 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. Since we’re dealing with a Database, it’s important that LLM generates factual response.

The Database

Database Connection String

The above code connects to a SQL database defined by connection string mysql://username:password@hostname/database_name

Ensure that you replace username, password, hostname, and database_name with your actual database credentials and details.

For the purpose of this article, we are using an inventory database, which is a structured repository designed to manage various aspects of stock levels in a retail environment. This database contains several tables, each serving a specific purpose in inventory management. The products table holds essential information about the items, including product ID, name, description, and price.

products.sql

CREATE TABLE products (
    product_id INT PRIMARY KEY,
    product_name VARCHAR(255) NOT NULL,
    description TEXT,
    price DECIMAL(10, 2) NOT NULL CHECK (price >= 0)
);

suppliers.sql

CREATE TABLE suppliers (
    supplier_id INT PRIMARY KEY,
    supplier_name VARCHAR(255) NOT NULL,
    contact_name VARCHAR(255),
    contact_email VARCHAR(255),
    contact_phone VARCHAR(20)
);

inventory.sql

CREATE TABLE inventory (
    inventory_id INT PRIMARY KEY,
    product_id INT,
    supplier_id INT,
    quantity_in_stock INT NOT NULL CHECK (quantity_in_stock >= 0),
    last_stocked_date DATE,
    FOREIGN KEY (product_id) REFERENCES products(product_id),
    FOREIGN KEY (supplier_id) REFERENCES suppliers(supplier_id)
);

Now that we have our code ready, run the function using:


python langchain_sql_chat.py

The following prompt should appear

Enter a prompt:

To query the database, ask questions such as “How many units of Product X are in stock

You should see the following:

Type 'exit' to quit
> Entering new SQLDatabaseChain chain...
    How many units of Product X are in stock?
    SQLQuery:SELECT quantity_in_stock FROM "inventory" WHERE product_id = (SELECT product_id FROM "products" WHERE product_name = 'Product X');
    SQLResult: [(15,)]
    Answer:There are 15 units of Product X in stock.
    > Finished chain.

More complex questions

Let’s ask a question that requires a JOIN statement.

Type 'exit' to quit
> Entering new SQLDatabaseChain chain...
    List all products with less than 10 units in stock.
    SQLQuery:SELECT product_name FROM "products" INNER JOIN "inventory" ON products.product_id = inventory.product_id WHERE quantity_in_stock < 10;
    SQLResult: [('Product A',), ('Product B',), ('Product C',)]
    Answer:Products with less than 10 units in stock are Product A, Product B, and Product C.
    > Finished chain.

LangChain was able to deduce that a “JOIN” statement is required in order to satisfy the prompt.

Conclusion

This guide serves as a foundational step to get started with LangChain and Database implementations, enabling you to interact conversationally with your database. By following the outlined steps, you can leverage the power of LangChain to make database interactions more user-friendly and efficient.

Whether you are managing an inventory database or any other type of database, LangChain, combined with GPT, translates your natural language inquiries into precise SQL queries, allowing for seamless communication with your database.

Embark on your software development adventure with Next Idea Tech! As a distinguished software development and nearshore outsourcing firm, we deliver a plethora of business solutions tailored to your needs. We are here to connect you with the cream of the crop in talent for your projects. Our mission is to pair you with seasoned professionals who are experts in their fields. We boast a team of experienced developers, testers, project managers, and marketing specialists, all poised to contribute to your success!

Let’s discuss your business aspirations! We extend complimentary consultations to understand your project better.

Skills

Posted on

September 22, 2023