Generating Frontend and Backend Knowledge Bases for Your Custom GPT

Posted by Nuno Marques on 20 Dec 2024

Custom GPTs can boost productivity and collaboration in software development. Give one a structured knowledge base built from your project, and it can act as a programming assistant that knows your code. In this post, I'll walk you through a simple Node.js script I wrote that generates .txt files containing all relevant source files for both the frontend and backend of a project. These files are the knowledge base you hand to the custom GPT, turning it into a capable pair programming buddy.


The Script: Building the Knowledge Base

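Below is the Node.js script that generates one .txt file for the frontend directory and one for the backend directory. It collects the project's source files with the extensions .js, .ts, .tsx, .mjml, and .sql and writes out their content, each under a header naming the file. The script depends on the ignore package from npm, so install it first (npm install ignore).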

const fs = require('fs');
const path = require('path');
const ignore = require('ignore');

const generateFileList = (baseDir, outputFile) => {
    const gitignorePath = path.join(baseDir, '.gitignore');
    const ig = ignore();

    // Load .gitignore rules
    if (fs.existsSync(gitignorePath)) {
        const gitignoreContent = fs.readFileSync(gitignorePath, 'utf8');
        ig.add(gitignoreContent);
    }

    const files = [];

    const traverseDirectory = (dir) => {
        const items = fs.readdirSync(dir, { withFileTypes: true });

        for (const item of items) {
            const itemPath = path.join(dir, item.name);
            const relativePath = path.relative(baseDir, itemPath);

            // Skip node_modules explicitly, plus anything matched by .gitignore
            if (item.name === 'node_modules' || ig.ignores(relativePath)) continue;

            if (item.isDirectory()) {
                traverseDirectory(itemPath);
            } else {
                files.push(relativePath);
            }
        }
    };

    traverseDirectory(baseDir);

    // Filter and format files
    const sourceFiles = files
        .filter((file) => file.endsWith('.js') || file.endsWith('.ts') || file.endsWith('.tsx') || file.endsWith('.mjml') || file.endsWith('.sql'))
        .map((filePath) => {
            const content = fs.readFileSync(path.join(baseDir, filePath), 'utf8');
            return `
------------------------------------------------------------------------
File: ${filePath}
------------------------------------------------------------------------
${content}`;
        });

    // Write output to file
    fs.writeFileSync(outputFile, sourceFiles.join('\n\n'), 'utf8');
    console.log(`Generated ${outputFile}`);
};

const main = () => {
    const frontendDir = path.join(__dirname, 'frontend');
    const backendDir = path.join(__dirname, 'backend');

    if (!fs.existsSync(frontendDir) || !fs.existsSync(backendDir)) {
        console.error('Ensure both "frontend" and "backend" directories exist in the same location as this script.');
        process.exit(1);
    }

    generateFileList(frontendDir, 'frontend_files.txt');
    generateFileList(backendDir, 'backend_files.txt');
};

main();

How It Works

  1. File Traversal: The script recursively scans the frontend and backend directories.
  2. Exclusion Rules: It skips node_modules and anything matched by each directory's .gitignore.
  3. Content Compilation: It reads every remaining file with a matching extension and puts a header naming the file above its content.
  4. Output: The results are saved to two files, frontend_files.txt and backend_files.txt, in the directory the script is run from.
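
If your project uses other file types (.jsx, for example), the extension filter is the only part that needs to change. Here is a minimal sketch of that filter pulled out into a helper. The names SOURCE_EXTENSIONS and isSourceFile are my own for illustration and do not appear in the script above, and path.extname is used in place of the chained endsWith calls.

```javascript
const path = require('path');

// Extensions treated as source files (same list as the script's filter).
// Hypothetical constant: add entries such as '.jsx' to widen the net.
const SOURCE_EXTENSIONS = new Set(['.js', '.ts', '.tsx', '.mjml', '.sql']);

// Hypothetical helper: true when the file's extension is in the allow-list.
const isSourceFile = (filePath) => SOURCE_EXTENSIONS.has(path.extname(filePath));

const candidates = ['src/App.tsx', 'db/schema.sql', 'README.md', 'emails/welcome.mjml'];

// README.md is dropped; the other three paths are kept.
console.log(candidates.filter(isSourceFile));
```

In the script, this would replace the long condition inside .filter(...). One small difference: path.extname returns an empty string for a file named exactly ".js", whereas endsWith('.js') would match it.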

Example Output of Generated .txt Files

Each generated file contains every matching source file from its directory, one after another, each under its own header. Below is an example of what the output might look like:

Sample Output: backend_files.txt

-----------------------------------------------------------------------
File: app.js
-----------------------------------------------------------------------
import bodyParser from 'body-parser';
import express from 'express';
import corsMiddleware from './middlewares/cors.js';
import categoryRoutes from './routes/categories.js';
import collectionsRoutes from './routes/collections.js';
import recipesRoutes from './routes/recipes.js';
import userRoutes from './routes/users.js';

export const app = express();

app.use(bodyParser.json());
app.use(corsMiddleware);
app.use(express.static('public'));
app.use('/categories', categoryRoutes)
app.use('/collections', collectionsRoutes);
app.use('/recipes', recipesRoutes);
app.use('/users', userRoutes);

export default app;

-----------------------------------------------------------------------
File: config/firebase.js
-----------------------------------------------------------------------
import admin from 'firebase-admin';
import { serviceAccount } from './firebaseKey.js';

admin.initializeApp({
  credential: admin.credential.cert(serviceAccount),
});

const firestore = admin.firestore();
export { admin, firestore };

(...)
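
Before handing a file to your GPT, it can be worth checking which source files actually ended up in it. The sketch below reads the headers back out of the generated text. It assumes the header layout shown above (a dashed rule, a "File: ..." line, another dashed rule); listFiles and HEADER are hypothetical names, not part of the original script.

```javascript
// Matches one header: a dashed rule, "File: <path>", and a second dashed rule.
const HEADER = /^-{10,}\r?\nFile: (.+)\r?\n-{10,}\r?\n/gm;

// Hypothetical helper: splits generated text into { file, content } entries.
const listFiles = (text) => {
    const headers = [...text.matchAll(HEADER)];
    return headers.map((match, i) => {
        const start = match.index + match[0].length;
        const end = i + 1 < headers.length ? headers[i + 1].index : text.length;
        return { file: match[1], content: text.slice(start, end).trim() };
    });
};

// A tiny stand-in for backend_files.txt, built inline so the sketch runs as-is.
const rule = '-'.repeat(72);
const sample = [
    rule,
    'File: app.js',
    rule,
    "import express from 'express';",
    '',
    rule,
    'File: config/firebase.js',
    rule,
    "import admin from 'firebase-admin';",
].join('\n');

const entries = listFiles(sample);
console.log(entries.map((entry) => entry.file));
```

To run it on real output, replace sample with fs.readFileSync('backend_files.txt', 'utf8'). Note that a source file that itself contains a line sequence looking like a header would be split incorrectly.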

Why Use Custom GPTs as Pair Programming Buddies?

Using a custom GPT equipped with project-specific knowledge has several benefits:

1. On-Demand Documentation

The .txt files provide an easy-to-query knowledge base, so the GPT can answer questions about your codebase on demand.

2. Code Review Assistance

Custom GPTs can analyze patterns in your code and offer suggestions or identify inconsistencies.

3. Feature Implementation Guidance

Need help adding a feature? A custom GPT with access to your codebase can act as a guide, drawing on what it knows about your project structure and source files.

4. Boosted Productivity

No more hunting for files or details in a sprawling codebase. Your GPT can retrieve relevant snippets, saving precious development time.


Leveraging External AI Resources

For developers interested in exploring AI tools further, WiseDuckDev offers a comprehensive portfolio of AI-driven solutions tailored for web and mobile development. Their expertise in integrating AI into development workflows can provide valuable insights and tools to enhance your projects.

Additionally, WiseDuckDev GPTs presents a vast library of custom GPTs designed to assist developers across various domains, including web development, AI, blockchain, and more. Exploring these resources can offer practical examples and tools to complement your development process.


Usage Example: Training a GPT with the Files

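After generating the .txt files, give them to your custom GPT. In ChatGPT's GPT builder, the simplest route is to upload them as knowledge files, which the GPT then consults when answering; the underlying model is not retrained. If you are building your own assistant instead, embeddings-based search over the files or OpenAI's fine-tuning are other ways to supply the same context.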
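
If you go the embeddings route, the generated files usually need to be cut into smaller pieces before they can be indexed. Here is a minimal sketch of that splitting step only; it makes no API calls. The helper name splitIntoChunks and the 2000-character default are assumptions of mine, so pick a limit that suits the embedding model you use.

```javascript
// Hypothetical helper: splits text into chunks of at most maxChars characters,
// breaking only on line boundaries. A single line longer than maxChars is
// kept whole and becomes its own oversized chunk.
const splitIntoChunks = (text, maxChars = 2000) => {
    const chunks = [];
    let lines = [];
    let size = 0;

    for (const line of text.split('\n')) {
        // +1 accounts for the newline that joins this line to the previous one.
        if (lines.length > 0 && size + 1 + line.length > maxChars) {
            chunks.push(lines.join('\n'));
            lines = [];
            size = 0;
        }
        size += (lines.length > 0 ? 1 : 0) + line.length;
        lines.push(line);
    }

    if (lines.length > 0) chunks.push(lines.join('\n'));
    return chunks;
};

// Small inline demo with a deliberately tiny limit.
const demoChunks = splitIntoChunks('aaaa\nbbbb\ncccc', 9);
console.log(demoChunks.length);
```

For real use, pass in the contents of frontend_files.txt or backend_files.txt. Joining the chunks back together with newlines reproduces the original text, so nothing is lost in the split.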


Conclusion

Creating .txt knowledge bases for your custom GPT is an excellent way to unlock its full potential. With this simple script, you can generate a comprehensive overview of your frontend and backend source files, empowering your GPT to act as an effective pair programming buddy.

Happy coding!