Python Generator
Last modified: October 8, 2024
The Python Generator generates data using an IronPython script.
The Python Generator dialog box contains basic settings and custom settings. Change either group to customize the generator.
Custom settings
IronPython script text box
Specify your script in the text box.
Use an IronPython script
You can use an IronPython script to generate data. A script must define a main() function that takes a single config argument.
config is a dictionary containing keys you can use in your scripts.
You can use the following keys:
- config["column_type"] - the column data type
- config["column_size"] - the column size
- config["n_rows"] - the number of rows
- config["seed"] - the current random seed
- config["config_path"] - the path to the meaningful generators folder
- column_name - the name of a column. You do not need to go through the config argument to reach a column; a script can refer to it directly by its name.
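As a rough illustration of how these keys might be read, the sketch below builds a config dictionary by hand so that a script can be tried outside the tool. The helper name make_sample_config and all sample values (including the "varchar" type string) are assumptions made for illustration; in the generator, the dictionary is supplied for you.

```python
import random


def make_sample_config():
    # Hypothetical stand-in for the dictionary the tool passes to main()
    return {
        "column_type": "varchar",
        "column_size": 8,
        "n_rows": 5,
        "seed": 42,
        "config_path": ".",
    }


def main(config):
    # Seed a private generator so repeated runs give the same values
    rng = random.Random(config["seed"])
    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
    for _ in range(config["n_rows"]):
        # Keep every value within the column size
        length = rng.randint(1, config["column_size"])
        yield "".join(rng.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    for value in main(make_sample_config()):
        print(value)
```

Because the seed comes from config["seed"], calling main() twice with the same dictionary yields the same sequence.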
Use regular expressions
The Python Generator supports the RegexGenerator inner class, so you can use regular expressions in your scripts. You can create a RegexGenerator in any of the following three ways:
mygen = RegexGenerator(regular expression)
mygen = RegexGenerator(regular expression, is unique data, data length)
mygen = RegexGenerator(regular expression, is unique data, data length, seed)
Where:
- regular expression - an actual regular expression, e.g. "[0-9A-Z]+"
- is unique data - defines whether to generate unique data. Possible values are True and False
- data length - specifies the maximum length of a generated value. It can be a number or an expression such as config["column_size"]
- seed - a seed value. It can be a number or an expression such as config["seed"]
Usage example
mygen = RegexGenerator("[0-9A-Z]+", False, 30)
def main(config):
    # Yield one generated value per requested row
    i = 0
    while i < config["n_rows"]:
        i = i + 1
        yield mygen.Generate()
Examples of Python scripts
Sequential read of rows from a text file:
# Read a txt file sequentially
def getColors():
    fileName = config["config_path"] + '\\' + r"Colors.txt"
    with open(fileName) as f:
        content = f.readlines()
    for row in content:
        yield row
def main(config):
    return getColors()
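Note that readlines() keeps the trailing line break on every row, so values yielded this way end with a newline character. If that is not wanted, a reader along the following lines strips it and skips blank lines. This is a minimal sketch for trying the idea outside the tool; the names read_rows and write_sample_file and the sample data are hypothetical.

```python
import os
import tempfile


def read_rows(file_name):
    # Yield each non-empty line without its trailing line break
    with open(file_name) as f:
        for line in f:
            value = line.rstrip("\r\n")
            if value:
                yield value


def write_sample_file():
    # Hypothetical sample data, used only to try the reader locally
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        f.write("Red\nGreen\n\nBlue\n")
    return path
```

Inside the generator, the same function could be called with the path built from config["config_path"].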
Random read from a text file:
# Read a txt file randomly
import random
# Uncomment the line below to use a seed if needed
#random.seed(config["seed"])
def getColors():
    fileName = config["config_path"] + '\\' + r"Colors.txt"
    with open(fileName) as f:
        content = f.readlines()
    for row in content:
        yield row
def main(config):
    colors = list(getColors())
    while True:
        yield colors[random.randint(0, len(colors) - 1)]
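When reproducible output matters, one option is to keep the seed in a private random.Random instance instead of seeding the shared module-level generator. The sketch below assumes plain Python and uses random.choice, which picks a uniformly random element just as the index expression above does. The names pick_forever and take are hypothetical helpers.

```python
import random


def pick_forever(values, seed=None):
    # A private Random instance keeps the seed from affecting other code
    rng = random.Random(seed)
    while True:
        yield rng.choice(values)


def take(generator, count):
    # Helper for local testing: pull a fixed number of values
    return [next(generator) for _ in range(count)]
```

With the same seed, two runs of pick_forever() return the same sequence; inside the tool the seed would come from config["seed"].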
Sequential read of data for a column from a CSV file:
# Read a csv file sequentially
import csv
def getCountryCodes():
    columnName = 'ISO3166-1-Alpha-2'
    fileName = config["config_path"] + '\\' + r"CountryCodes.csv"
    with open(fileName, "rb") as file:
        reader = csv.DictReader(file, delimiter=';', quotechar='"')
        for row in reader:
            yield str(row[columnName]).upper()
def main(config):
    return getCountryCodes()
Random read of data for a column from a CSV file:
# Read a csv file randomly
import csv
import random
# Uncomment the line below to use a seed if needed
#random.seed(config["seed"])
def getCountryCodes():
    columnName = 'ISO3166-1-Alpha-2'
    fileName = config["config_path"] + '\\' + r"CountryCodes.csv"
    with open(fileName, "rb") as file:
        reader = csv.DictReader(file, delimiter=';', quotechar='"')
        for row in reader:
            yield str(row[columnName]).upper()
def main(config):
    codes = list(getCountryCodes())
    while True:
        yield codes[random.randint(0, len(codes) - 1)]
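The CSV examples above open the file in "rb" mode, which is what the csv module expects under Python 2. Under Python 3, the csv module expects a file opened in text mode with newline="". The sketch below shows that variant for testing a reader in plain Python 3; whether it applies inside the tool depends on the IronPython version in use, which is an assumption here. The names read_column and write_sample_csv and the sample rows are hypothetical.

```python
import csv
import os
import tempfile


def read_column(file_name, column_name):
    # Python 3: open in text mode with newline="" as the csv module expects
    with open(file_name, newline="") as f:
        reader = csv.DictReader(f, delimiter=";", quotechar='"')
        for row in reader:
            yield row[column_name].upper()


def write_sample_csv():
    # Hypothetical sample data, used only to try the reader locally
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "w", newline="") as f:
        f.write("Name;ISO3166-1-Alpha-2\nFrance;fr\nJapan;jp\n")
    return path
```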
Sequential read of rows from an XML file:
# Read an XML file
# Use the CLR XML libraries
import clr
clr.AddReference("System.Xml")
from System.Xml.XPath import XPathDocument, XPathNavigator
def getTitles(column_size=50):
    filename = r"D:\books.xml"
    doc = XPathDocument(filename)
    nav = doc.CreateNavigator()
    expr = nav.Compile("/catalog/book/title")
    titles = nav.Select(expr)
    for title in titles:
        yield str(title)[:column_size]
def main(config):
    # Truncate titles to the column size
    return list(getTitles(column_size=config["column_size"]))
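To try the same truncation logic without the .NET XML classes, the standard library's xml.etree.ElementTree can stand in for XPathDocument. This is a sketch for local testing in plain Python, not the method the tool requires; the name get_titles and the sample catalog are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document with the same /catalog/book/title layout
SAMPLE_XML = """<catalog>
  <book><title>XML Developer's Guide</title></book>
  <book><title>Midnight Rain</title></book>
</catalog>"""


def get_titles(xml_text, column_size=50):
    root = ET.fromstring(xml_text)
    # root is the <catalog> element, so the path is relative to it
    for title in root.findall("book/title"):
        # Truncate each title to the column size
        yield (title.text or "")[:column_size]
```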
Generation of male and female names from files depending on a flag:
# Select a name depending on a flag
import random
# Uncomment the line below to use a seed if needed
#random.seed(config["seed"])
def getPersonNames(fileName):
    fileName = config["config_path"] + '\\' + fileName
    with open(fileName) as f:
        content = f.readlines()
    for row in content:
        yield row
def main(config):
    males = list(getPersonNames(r"FirstNamesMale.txt"))
    females = list(getPersonNames(r"FirstNamesFemale.txt"))
    while True:
        # is_male is a flag column
        if is_male:
            yield males[random.randint(0, len(males) - 1)]
        else:
            yield females[random.randint(0, len(females) - 1)]
Launch of a certain generator depending on a flag:
# Select a generator depending on a flag
def main(config):
    maleGen = RegexGenerator("Automotive|Computers|Crafts|Tools")
    femaleGen = RegexGenerator("Furniture|Pharmacy|Garden|Gifts")
    while True:
        # is_male is a flag column
        if is_male:
            yield maleGen.Generate()
        else:
            yield femaleGen.Generate()
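The flag-based scripts above rely on the is_male column, which exists only inside the tool. To check the branching logic on its own, the sketch below passes the flag values in as a list and picks from plain Python lists instead of RegexGenerator. The function name pick_by_flag and the flags argument are hypothetical stand-ins.

```python
import random

# Value pools taken from the regular expressions in the example above
MALE_POOL = "Automotive|Computers|Crafts|Tools".split("|")
FEMALE_POOL = "Furniture|Pharmacy|Garden|Gifts".split("|")


def pick_by_flag(flags, when_true, when_false, seed=None):
    # flags stands in for the is_male column values (hypothetical input)
    rng = random.Random(seed)
    for flag in flags:
        pool = when_true if flag else when_false
        yield rng.choice(pool)
```

For example, list(pick_by_flag([True, False], MALE_POOL, FEMALE_POOL, seed=1)) returns one value from each pool, in that order.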
Calculation of a new date based on a date value from another field:
# Calculate a date based on another date
import random
# DBNull comes from the .NET System namespace
from System import DBNull
# Uncomment the line below to use a seed if needed
#random.seed(config["seed"])
def main(config):
    # StartDate is a column name
    if str(StartDate) == '':
        return DBNull.Value
    # Add a random number of days to StartDate
    return StartDate.AddDays(random.randint(1, 1000))
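The same calculation can be tried in plain Python, where datetime.timedelta plays the role of the .NET AddDays method and None stands in for DBNull.Value. This is a sketch for local testing only; the function name shifted_date and the use of None for missing values are assumptions.

```python
import datetime
import random


def shifted_date(start_date, rng):
    # Mirror the empty-value check: pass missing dates through as None
    if start_date is None:
        return None
    # Add between 1 and 1000 days, as in the script above
    return start_date + datetime.timedelta(days=rng.randint(1, 1000))
```

Passing a random.Random instance created from a fixed seed makes the shifted dates repeatable between runs.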