Create Dummy Structured Data

Context:

The purpose of this notebook is to generate synthetic tabular data that you can use in the RAG with structured and unstructured data workshop.

This notebook runs the predefined Python scripts in the pythonScripts folder to generate dummy data. The generated data is saved as four CSV files inside the sds folder. SDS here stands for synthetic dataset.

!pip install faker
<h2>Execute the scripts in the pythonScripts directory:</h2>
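import os

# Sort the listing so the scripts run in the same order on every machine.
files = sorted(os.listdir('pythonScripts'))

# Create the output folder before running the scripts that write into it.
directory = 'sds'
if not os.path.exists(directory):
    print(f'directory not found, creating {directory} directory')
    os.makedirs(directory)

# Run each Python script in turn; %run is an IPython magic, so this cell only works inside a notebook.
for file_name in files:
    if file_name.endswith('.py'):
        print(f"Running {file_name}")
        %run pythonScripts/{file_name}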
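
Once the scripts have finished, it can be useful to confirm that the CSV files actually landed in the sds folder. The following is a minimal sketch of such a check, not part of the workshop scripts: the helper name `summarize_csvs` is hypothetical, and it assumes each file is UTF-8 encoded with a header in its first row.

```python
import csv
from pathlib import Path


def summarize_csvs(directory="sds"):
    """Return {file name: (header, data row count)} for each CSV in `directory`.

    Hypothetical helper for illustration; assumes the first row of every
    file is a header and that the files are UTF-8 encoded.
    """
    root = Path(directory)
    if not root.is_dir():
        return {}  # nothing generated yet

    summary = {}
    for path in sorted(root.glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            header = next(reader, [])           # first row, or [] if the file is empty
            row_count = sum(1 for _ in reader)  # every remaining row counts as data
        summary[path.name] = (header, row_count)
    return summary


# Print one line per generated file.
for name, (header, rows) in summarize_csvs().items():
    print(f"{name}: {rows} rows, columns = {header}")
```

If fewer than four files are listed, one of the scripts probably failed or wrote its output somewhere other than sds.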

End