Python generator

Last modified: June 22, 2025

The Python generator creates data using an IronPython script. In the Fill settings section, you can set up basic and custom settings.

Custom settings

IronPython script text box: Specify the script in the text box.

Column: Insert a column into the script.

Function: Insert a function into the script.

Using an IronPython script

You can use an IronPython script to generate data. A script must define a main() function that takes a single config argument.

config is a dictionary containing keys that you can use in your scripts.

The following keys are available:

  • config["column_type"]: the column data type
  • config["column_size"]: the column size
  • config["n_rows"]: the number of rows
  • config["seed"]: the current random seed
  • config["config_path"]: the path to the meaningful generators folder
  • column_name: the name of a column. You do not have to prefix it with config; it can be referenced directly.
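
As a minimal sketch of how these keys might be used together, the script below yields one value per requested row and truncates each value to the column size. The sample_config dictionary is a hypothetical stand-in for a local dry run, and its values are made up. Inside the generator, the tool supplies config itself.

```python
def main(config):
    # Yield one value per requested row, cut to fit the column.
    size = config["column_size"]
    for i in range(config["n_rows"]):
        value = "row-%d" % (i + 1)
        yield value[:size]

# Hypothetical stand-in for the config the tool passes to main();
# every value below is invented for local testing only.
sample_config = {
    "column_type": "varchar",
    "column_size": 5,
    "n_rows": 3,
    "seed": 1,
    "config_path": ".",
}

rows = list(main(sample_config))
print(rows)
```

Reading column_size and n_rows from config rather than hard-coding them keeps the script valid if the column definition changes.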

Using regular expressions

The Python generator supports the RegexGenerator inner class, so you can generate values from regular expressions within a script.

You can create the RegexGenerator in three ways:

mygen = RegexGenerator(regular expression)

mygen = RegexGenerator(regular expression, is unique data, data length)

mygen = RegexGenerator(regular expression, is unique data, data length, seed)

where:

  • regular expression: The regular expression to generate values from, for example, "[0-9A-Z]+".
  • is unique data: Whether to generate unique data. Available values are False and True.
  • data length: The maximum length of a generated value. It can be a number or an expression such as config["column_size"].
  • seed: The seed value. It can be a number or an expression such as config["seed"].
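
The effect of a seed can be illustrated with Python's standard random module, used here only as an analogy. RegexGenerator is available only inside the tool, and its internals are not described in this document. In the sketch below, sample_codes is a hypothetical helper; two calls with the same seed produce the same sequence.

```python
import random

def sample_codes(seed, count, length=4):
    # A private Random instance keeps the sequence independent of global state.
    rng = random.Random(seed)
    alphabet = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return ["".join(rng.choice(alphabet) for _ in range(length))
            for _ in range(count)]

first = sample_codes(seed=42, count=3)
second = sample_codes(seed=42, count=3)
print(first == second)  # same seed, same sequence
```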

Preview of the column data generated by the Python generator

Usage example

mygen = RegexGenerator("[0-9A-Z]+", False, 30)

def main(config):
    # Yield one generated value per requested row
    i = 0
    while i < config["n_rows"]:
        i += 1
        yield mygen.Generate()

Examples of Python scripts

Sequential read of rows from a text file:

# Generator function to yield each color line from the file
def getColors():
    fileName = config["config_path"] + '\\' + "Colors.txt"
    with open(fileName, 'r') as f:
        content = f.readlines()
        for row in content:
            yield row.strip()  

def main(config):
    return getColors()

if __name__ == "__main__":
    for color in main(config):
        print(color)

Random read from a text file:

import random

# Uncomment the line below to use a seed if needed
# random.seed(config["seed"])

def getColors():
    fileName = config["config_path"] + '\\' + "Colors.txt"
    with open(fileName, 'r') as f:
        content = f.readlines()
        for row in content:
            yield row.strip()

def main(config):
    colors = list(getColors())
    while True:
        yield random.choice(colors)

if __name__ == "__main__":
    generator = main(config)
    for _ in range(5):
        print(next(generator))
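
Because main() in the example above never stops yielding, a local dry run has to cap how many values it takes. One way to sketch that is with itertools.islice. Here preview is a hypothetical helper, and the in-memory color list stands in for the contents of Colors.txt; neither is part of the tool.

```python
import itertools
import random

def main(config):
    # Stand-in for the file contents; the real script reads Colors.txt.
    colors = ["Red", "Green", "Blue"]
    rng = random.Random(config["seed"])
    while True:
        yield rng.choice(colors)

def preview(gen, count):
    # Take only the first `count` values from a possibly endless generator.
    return list(itertools.islice(gen, count))

values = preview(main({"seed": 7}), 5)
print(values)
```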

Sequential read of data for a column from a CSV file:

import csv

def getCountryCodes():
    columnName = 'ISO3166-1-Alpha-2'
    fileName = config["config_path"] + '\\' + "CountryCodes.csv"
    with open(fileName, "rb") as file:
        reader = csv.DictReader(file, delimiter=';', quotechar='"')
        for row in reader:
            yield row[columnName].strip().upper()

def main(config):
    return getCountryCodes()

# Example usage
if __name__ == "__main__":
    for code in main(config):
        print(code)

Random read of data for a column from a CSV file:

import csv
import random

# Uncomment the line below to use a seed if needed
# random.seed(config["seed"])

def getCountryCodes():
    columnName = 'ISO3166-1-Alpha-2'
    fileName = config["config_path"] + '\\' + "CountryCodes.csv"
    with open(fileName, "rb") as file:
        reader = csv.DictReader(file, delimiter=';', quotechar='"')
        for row in reader:
            yield str(row[columnName]).strip().upper()

def main(config):
    codes = list(getCountryCodes())
    while True:
        yield random.choice(codes)  


if __name__ == "__main__":
    generator = main(config)
    for _ in range(5):
        print(next(generator))

Sequential read of rows from an XML file:

# Read an XML file
# Use the CLR XML libraries
 
import clr

clr.AddReference("System.Xml")
from System.Xml.XPath import XPathDocument, XPathNavigator
 
def getTitles(column_size=50):
    filename = r"C:\PyGen\books.xml"
    doc = XPathDocument(filename)
    nav = doc.CreateNavigator()
    expr = nav.Compile("/catalog/book/title")
    titles = nav.Select(expr)
    for title in titles:
        yield str(title)[:column_size]
 
def main(config):
    # Truncate titles to the column size
    return list(getTitles(column_size=config["column_size"]))

Generation of male and female names from files depending on a flag:

import random


# Uncomment the line below to use a seed if needed
# random.seed(config["seed"])

def getPersonNames(fileName):
    fileName = config["config_path"] + '\\' + fileName
    with open(fileName, "rb") as f:
        content = f.readlines()
        for row in content:
            yield row.strip()

def main(config):
    males = list(getPersonNames("FirstNamesMale.txt"))
    females = list(getPersonNames("FirstNamesFemale.txt"))
    while True:
        # Randomly decide gender (True for male, False for female)
        is_male = random.choice([True, False])
        if is_male:
            yield random.choice(males)
        else:
            yield random.choice(females)

if __name__ == "__main__":
    generator = main(config)
    for _ in range(5):
        print(next(generator))

Running a certain generator depending on a random flag:

import random

def main(config):
    maleGen = RegexGenerator("Automotive|Computers|Crafts|Tools")
    femaleGen = RegexGenerator("Furniture|Pharmacy|Garden|Gifts")
    while True:
        # Random gender flag for demonstration (replace with your own logic)
        is_male = random.choice([True, False])
        if is_male:
            yield maleGen.Generate()
        else:
            yield femaleGen.Generate()

Running a certain generator depending on a flag:

# Select a generator depending on a flag

def main(config):
    maleGen = RegexGenerator("Automotive|Computers|Crafts|Tools")
    femaleGen = RegexGenerator("Furniture|Pharmacy|Garden|Gifts")
    while True:
        # is_male is a flag field
        if is_male:
            yield maleGen.Generate()
        else:
            yield femaleGen.Generate()

Calculating a new date based on a date value from another field:

import random
import clr
from System import DBNull

# Uncomment the line below to use a seed if needed
# random.seed(config["seed"])

def main(config):
    # Safely get StartDate, returns NULL if not found
    StartDate = getattr(config, 'StartDate', None)

    if not StartDate or str(StartDate) == '':
        return DBNull.Value

    # Add a random number of days to StartDate
    return StartDate.AddDays(random.randint(1, 1000))
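
The same calculation can be sketched in pure Python with the standard datetime module, which may be easier to test outside the tool. In this sketch, shift_date is a hypothetical helper, and None stands in for a database NULL.

```python
import random
from datetime import date, timedelta

def shift_date(start, rng, max_days=1000):
    # Propagate a missing start date instead of failing.
    if start is None:
        return None
    # Move forward by 1..max_days days (both ends inclusive).
    return start + timedelta(days=rng.randint(1, max_days))

rng = random.Random(1)
end = shift_date(date(2025, 1, 1), rng)
print(end)
```

Passing the random number generator in as an argument keeps the helper repeatable for a given seed.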