Python script examples
Last modified: October 8, 2024
The topic provides the examples of IronPython scripts, which you can use to generate data with the Python script generator. You can use these scripts examples as a template for creating your own scripts.
AGE
The script generates life years based on the date of birth and date of death.
<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Generator>
<Category>Personal</Category>
<Name>Age</Name>
<ShortName>Age</ShortName>
<Description>generate number of years that are dependent on date of birth and date of death</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>int</Type>
<Type>smallint</Type>
<Type>tinyint</Type>
</ColumnTypes>
<Matches>
<Match table=".*" column="^AGE$" priority="100" />
</Matches>
<Options>
<Expression># The generator calculates the age based on the date of birth and date of death.
import random
from System import *
from System import DateTime
from System import DBNull
from System.Globalization import CultureInfo
def main():
# DOB - name of the column in the current table that contains dates of birth.
# This name should be replaced with the actual name of the column of your table, for example DAYOFBIRTHDAY.
# DOD - name of the column in the current table that contains dates of death.
# This name should be replaced with the actual name of the column of your table, for example DAYOFDEATH.
bd = DOB
dd = DOD
if str(bd) == '' or bd == DBNull.Value or str(dd) == '' or dd == DBNull.Value:
return 0
birthdate = ''
deathdate = ''
if type(bd).__name__ == 'str':
birthdate = DateTime.ParseExact(bd,"dd/MM/yyyy", CultureInfo.InvariantCulture)
else: # else it's supposed to be DateTime
birthdate = bd
if type(dd).__name__ == 'str':
deathdate = DateTime.ParseExact(dd,"dd/MM/yyyy HH:mm:ss.fff", CultureInfo.InvariantCulture)
else: # else it's supposed to be DateTime
deathdate = dd
if deathdate < birthdate:
return 0
# Calculate the age based on the date of birth and date of death.
return Convert.ToInt32(Math.Truncate((deathdate - birthdate).TotalDays / 365))
</Expression>
<AllowNulls>false</AllowNulls>
</Options>
</Generator>
</Generators>
DEATH DATE
The generator generates dates of death based on the dates of birth with the possibility to set lifespan that depending on the weight list.
<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Generator>
<Category>Personal</Category>
<Name>Death Date</Name>
<ShortName>Death Date</ShortName>
<Description>fill with dates of death with different probability</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>date</Type>
<Type>datetime</Type>
<Type>datetime2</Type>
<Type>smalldatetime</Type>
</ColumnTypes>
<Matches>
<Match table=".*" column="^Death.*Date$" priority="90" />
<Match table=".*" column="^Date.*Death$" priority="90" />
<Match table=".*" column="^Day.*Death$" priority="90" />
<Match table=".*" column=".*dead" priority="10" />
<Match table=".*" column="^DOD$" priority="20" />
<Match table=".*" column="^ddate$" priority="30" />
<Match table=".*" column="^dday$" priority="10" />
</Matches>
<Options>
<Expression>#This generator generates dates of death basing on the dates of birth with the option to set lifespan that is based on the weight list.
#It has been made by request of our user from forum http://forums.devart.com/viewtopic.php?f=23&t=33396
import random
from System import *
from System import DateTime
from System import DBNull
from System.Globalization import CultureInfo
def main():
# DOB - name of the column in the current table, that contains dates of birth.
# This name should be replaced with the actual name of the column of your table, for example DAYOFBIRTHDAY.
bd = DOB
if str(bd) == '':
return DBNull.Value
# "dd/MM/yyyy" - is the format of date of birth. If it differs, make the following modifications.
bdFormat = "dd/MM/yyyy"
x = random.randint(1, 100)
deltaDays = 0
# Setting of percentage ranges for ages. The sum of all ranges should be equal to 100%.
if x <= 5: # 5 %
deltaDays = random.randint(1, 3650) # 5 % - death at age 1..3650 days (1-10 years)
elif x > 5 and x <= 20: # 15 %
deltaDays = random.randint(3650, 18250) # 15 % - death at age 3650..18250 days (10-50 years)
else: # 80 %
deltaDays = random.randint(18250, 36500) # 80 % - death at age 18250..36500 days (50-100 years)
birthdate = ''
# Check of column type for the date of birth column (if it is a string column, the conversion to the DATE format should be performed).
if type(bd).__name__ == 'str':
birthdate = DateTime.ParseExact(bd,bdFormat, CultureInfo.InvariantCulture)
else: # else it's supposed to be DateTime
birthdate = bd
# Addition of lifespan (in days).
deathDate = birthdate.AddDays(deltaDays);
# If the date of death is bigger than the current date, the random quantity of days should be subtracted from the current date.
if deathDate >= DateTime.Now:
deathDate = DateTime.Now.AddDays(-random.randint(1, 300));
if deathDate < birthdate:
deathDate = birthdate
return deathDate.Date
</Expression>
<AllowNulls>false</AllowNulls>
</Options>
</Generator>
</Generators>
DATA UNIVERSAL NUMBERING SYSTEM NUMBER (DUNS Number)
The script generates DUNS numbers using the Luhn algorithm.
<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Generator>
<Category>Business</Category>
<Name>Data Universal Numbering System Number (DUNS Number)</Name>
<ShortName>DUNS Number</ShortName>
<Description>481795235, 796124832, 213794852, 873491252, ...</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>varchar</Type>
<Type>char</Type>
<Type>nvarchar</Type>
<Type>nchar</Type>
<Type>bigint</Type>
<Type>int</Type>
</ColumnTypes>
<Matches>
<Match table=".*" column="^DUNS$" priority="100" minlen="9" maxlen="10" />
<Match table=".*" column="DUNS.*code" priority="60" minlen="9" maxlen="10" />
</Matches>
<Options>
<Expression># This script generates DUNS numbers using the Luhn algorithm
import random
random.seed(config["seed"])
# Calculate checksum by mod 10
def luhn_checksum(input_number):
def digits_of(n):
return [int(d) for d in str(n)]
digits = digits_of(input_number)
odd_digits = digits[-1::-2]
even_digits = digits[-2::-2]
checksum = 0
checksum += sum(odd_digits)
for d in even_digits:
checksum += sum(digits_of(d*2))
return checksum % 10
def calculate_luhn(partial_input_number):
check_digit = luhn_checksum(int(partial_input_number) * 10)
return check_digit if check_digit == 0 else 10 - check_digit
def GetDigit():
value = random.sample(range(1, 10), 8)
return ''.join(str(x) for x in value)
def main():
i = 0
while i <= config["n_rows"]:
i = i + 1
d = GetDigit()
yield str(d) + str(calculate_luhn(d))
main()</Expression>
</Options>
</Generator>
<Generator>
<Category>Business</Category>
<Name>Data Universal Numbering System Number (DUNS Number)</Name>
<ShortName>DUNS Number</ShortName>
<Description>48-179-5235, 79-612-4832, 21-379-4852, 87-349-1252, ...</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>varchar</Type>
<Type>char</Type>
<Type>nvarchar</Type>
<Type>nchar</Type>
<Type>text</Type>
<Type>ntext</Type>
<Type>varchar(max)</Type>
<Type>nvarchar(max)</Type>
</ColumnTypes>
<Matches>
<Match table=".*" column="^DUNS$" priority="100" minlen="11" />
<Match table=".*" column="DUNS.*code" priority="60" minlen="11" />
</Matches>
<Options>
<Expression># This script generates DUNS numbers using the Luhn algorithm
import random
random.seed(config["seed"])
def luhn_checksum(input_number):
def digits_of(n):
return [int(d) for d in str(n)]
digits = digits_of(input_number)
odd_digits = digits[-1::-2]
even_digits = digits[-2::-2]
checksum = 0
checksum += sum(odd_digits)
for d in even_digits:
checksum += sum(digits_of(d*2))
return checksum % 10
def calculate_luhn(partial_input_number):
check_digit = luhn_checksum(int(partial_input_number) * 10)
return check_digit if check_digit == 0 else 10 - check_digit
def GetDigit():
value = random.sample(range(1, 10), 8)
return ''.join(str(x) for x in value)
def formatOutPut(value):
s = '{0}{1}-{2}{3}{4}-{5}{6}{7}{8}'
return s.format(*list(value))
def main():
i = 0
while i <= config["n_rows"]:
i = i + 1
d = GetDigit()
yield formatOutPut(str(d) + str(calculate_luhn(d)))
main()</Expression>
</Options>
</Generator>
</Generators>
URL (Gravatar image)
The script retrieves a Gravatar image url based on an email address.
<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Generator>
<Category>Personal</Category>
<Name>URL (Gravatar image)</Name>
<ShortName>URL (Gravatar img)</ShortName>
<Description>http://www.gravatar.com/avatar/6ffc45896e63ba040e769b8b9d150107?s=40&d=http%3A%2F%2Fwww.example.com%2Fdefault.jpg ;...</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>varchar</Type>
<Type>char</Type>
<Type>nvarchar</Type>
<Type>nchar</Type>
<Type>text</Type>
<Type>ntext</Type>
<Type>varchar(max)</Type>
<Type>nvarchar(max)</Type>
</ColumnTypes>
<Matches>
<Match table=".*" column="^gravatar$" priority="100" minlen="150" />
</Matches>
<Options>
<Expression># This script retrieves a Gravatar image url based on an email address
import urllib, hashlib
# Set your variables here
email = "my_gravatar_email@example.com"
# email = email_column_name
default = "http://www.example.com/default.jpg"
size = 40
# construct the url
gravatar_url = "http://www.gravatar.com/avatar/" + hashlib.md5(email.lower()).hexdigest() + "?"
gravatar_url += urllib.urlencode({'d':default, 's':str(size)})
gravatar_url</Expression>
</Options>
</Generator>
</Generators>
INTEGRATED CIRCUIT CARD ID (ICCID)
The script generates primary account numbers that have 19 digits long.
<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Generator>
<Category>Personal</Category>
<Name>Integrated Circuit Card ID (ICCID)</Name>
<ShortName>ICCID</ShortName>
<Description>8938605388545996967, 8937103007991618431, 8935602817534397104, ...</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>varchar</Type>
<Type>char</Type>
<Type>nvarchar</Type>
<Type>nchar</Type>
<Type>bigint</Type>
<Type>decimal</Type>
</ColumnTypes>
<Matches>
<Match table=".*" column="^ICCID$" priority="105" minlen="19" />
</Matches>
<Options>
<Expression>import random
import csv
random.seed(config["seed"])
# Calculates luhn checksum
def luhn_checksum(input_number):
def digits_of(n):
return [int(d) for d in str(n)]
digits = digits_of(input_number)
odd_digits = digits[-1::-2]
even_digits = digits[-2::-2]
checksum = 0
checksum += sum(odd_digits)
for d in even_digits:
checksum += sum(digits_of(d*2))
return checksum % 10
def GetDigit(codes):
# Major industry identifier
MII = '89'
# Country code
CC = codes[random.randint(1, len(codes)-1)]
# Issuer identifier
randoms=[]
for i in range (4- len(CC)):
randoms.append(0)
II = ''.join(str(x) for x in randoms) + str(random.randint(1,5))
value = MII + CC + II
# Individual account identification
randoms=[]
for i in range (18- len(value)):
randoms.append(random.randint(0,9))
IAI = ''.join(str(x) for x in randoms)
value = value + IAI
return value
# Getting list of telephone country codes from file CountryCodes.csv
def countryCodes():
var_columnName = 'country_dial'
var_fileName = config["config_path"] +'\\'+ r"CountryCodes.csv"
with open(var_fileName,"rb") as csvfile:
var_reader = csv.DictReader(csvfile, delimiter=';', quotechar='"')
for row in var_reader:
yield str(row[var_columnName]).upper()
def main():
random.seed(config["seed"])
codes = list(countryCodes())
i = 0
while i <= config["n_rows"]:
i = i + 1
d = GetDigit(codes)
yield d + str(luhn_checksum(d))
main()
</Expression>
<AllowNulls>false</AllowNulls>
</Options>
</Generator>
</Generators>
LANGUAGE CODE (ISO 639-1 alpha-2, UpperCase)
The script reads data from a csv file, converts it to uppercase, and shuffles data in the list.
<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Generator>
<Category>Product</Category>
<Name>Language Code (ISO 639-1 alpha-2, UpperCase)</Name>
<ShortName>Language Code 2 (u)</ShortName>
<Description>ES, FR, ZH, UK, TH, ZU, EN, JA, NL, ...</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>varchar</Type>
<Type>char</Type>
<Type>nvarchar</Type>
<Type>nchar</Type>
<Type>text</Type>
<Type>ntext</Type>
<Type>varchar(max)</Type>
<Type>nvarchar(max)</Type>
</ColumnTypes>
<Matches>
<Match table="Language" column="Lang.*id$" priority="21" minlen="2" />
</Matches>
<Options>
<Expression># This script reads data from a csv file, converts it to uppercase, and shuffles data in the list
import csv
import random
random.seed(config["seed"])
var_fileName = config["config_path"] +'\\'+ r"LanguageCodes.csv"
var_columnName = 'ISO639-1-Alpha-2'
def values():
with open(var_fileName,"rb") as csvfile:
var_reader = csv.DictReader(csvfile, delimiter=';', quotechar='"')
for row in var_reader:
yield str(row[var_columnName]).upper()
var_lst = list(values())
random.shuffle(var_lst)
var_lst</Expression>
</Options>
</Generator>
</Generators>
Language Code (ISO 639-1 alpha-3, UpperCase)
The script reads data from a csv file, converts it to uppercase, and shuffles data in the list.
<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Generator>
<Category>Product</Category>
<Name>Language Code (ISO 639-1 alpha-3, UpperCase)</Name>
<ShortName>Language Code 3 (u)</ShortName>
<Description>SPA, FRE, CHI, UKR, THA, ZUL, ENG, JPN, DUT, ...</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>varchar</Type>
<Type>char</Type>
<Type>nvarchar</Type>
<Type>nchar</Type>
<Type>text</Type>
<Type>ntext</Type>
<Type>varchar(max)</Type>
<Type>nvarchar(max)</Type>
</ColumnTypes>
<Matches>
<Match table="Language" column="Lang.*id$" priority="31" minlen="3" />
</Matches>
<Options>
<Expression># This script reads data from a csv file, converts it to uppercase, and shuffles data in the list
import csv
import random
random.seed(config["seed"])
var_fileName = config["config_path"] +'\\'+ r"LanguageCodes.csv"
var_columnName = 'ISO639-1-Alpha-3b'
def values():
with open(var_fileName,"rb") as csvfile:
var_reader = csv.DictReader(csvfile, delimiter=';', quotechar='"')
for row in var_reader:
yield str(row[var_columnName]).upper()
var_lst = list(values())
random.shuffle(var_lst)
var_lst</Expression>
</Options>
</Generator>
</Generators>
LOREM PIXEL
The script dynamically generates placeholder images.
<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Generator>
<Category>IT</Category>
<Name>Lorem Pixel</Name>
<ShortName>Lorem Pixel</ShortName>
<Description>dynamically generates placeholder images (!time-consuming!)</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>varbinary(max)</Type>
<Type>image</Type>
</ColumnTypes>
<Matches>
</Matches>
<Options>
<Expression>import clr
clr.AddReference("System.Net")
from System.Net import WebClient
import clr
clr.AddReference("System.IO")
from System.IO import MemoryStream, Stream
import clr
clr.AddReference("System.Drawing")
from System.Drawing import Image
clr.AddReference("System")
from System import Random
random = Random()
categories = ['abstract', 'animals', 'business', 'cats', 'city', 'food', 'nightlife' , 'fashion', 'people', 'nature', 'sports', 'technics', 'transport']
def getUrl():
varCategoryIndex = random.Next(len(categories))
varCategoryName =categories[varCategoryIndex]
varUrlTemplate = 'http://lorempixel.com/100/100/{0}/{1}'
varUrl = varUrlTemplate.format(varCategoryName,varCategoryName)
return varUrl
def main():
i = 0
while i <= config["n_rows"]:
webClient = WebClient()
dataImage = webClient.DownloadData(getUrl())
stream = MemoryStream(dataImage)
webClient.Dispose()
yield Image.FromStream(stream)
stream.Dispose()
main()</Expression>
</Options>
</Generator>
</Generators>
MD5 HASH
The script generates MD5 hash based on values in some string or a table column.
<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<Generator>
<Category>IT</Category>
<Name>MD5 Hash</Name>
<ShortName>MD5 Hash</ShortName>
<Description>605669cab962bf944d99ce89cf9e58d9, ee3c6e8ed9c27c45f161ea416c997df8, 8625e1de7be14c39b1d14dc03d822497, ...</Description>
<Type>SqlPythonGenerator</Type>
<ColumnTypes>
<Type>varchar</Type>
<Type>char</Type>
<Type>nvarchar</Type>
<Type>nchar</Type>
<Type>text</Type>
<Type>ntext</Type>
<Type>varchar(max)</Type>
<Type>nvarchar(max)</Type>
</ColumnTypes>
<Matches>
</Matches>
<Options>
<Expression># This script generates MD5 hash based on values in some string or a table column
import hashlib
def HashIt():
# You can also use 'sha1', 'sha256', 'sha384', 'sha512'
m = hashlib.md5()
# You can use a column_name instead of 'Some Value'
m.update('Some Value')
return m.hexdigest()
HashIt()</Expression>
</Options>
</Generator>
</Generators>
For detailed instructions on how to generate related data elements, refer to our blog post.
Want to find out more?
Overview
Take a quick tour to learn all about the key benefits delivered by Data Generator for SQL Server.
All features
Get acquainted with the rich features and capabilities of the tool in less than 5 minutes.
Request a demo
If you consider employing this tool for your business, request a demo to see it in action.