Python script examples

The topic provides the examples of IronPython scripts, which you can use to generate data with the Python script generator. You can use these scripts examples as a template for creating your own scripts.

AGE

The script generates life years based on the date of birth and date of death.

<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Generator>
    <Category>Personal</Category>
    <Name>Age</Name>
    <ShortName>Age</ShortName>
    <Description>generate number of years that are dependent on date of birth and date of death</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>int</Type>
      <Type>smallint</Type>
      <Type>tinyint</Type>
    </ColumnTypes>
    <Matches>
      <Match table=".*" column="^AGE$" priority="100" />
    </Matches>
    <Options>
      <Expression># The generator calculates the age based on the date of birth and date of death.

import random
from System import *
from System import DateTime
from System import DBNull
from System.Globalization import CultureInfo

def main(): 

# DOB - name of the column in the current table that contains dates of birth.
# This name should be replaced with the actual name of the column of your table, for example DAYOFBIRTHDAY.

# DOD - name of the column in the current table that contains dates of death. 
# This name should be replaced with the actual name of the column of your table, for example DAYOFDEATH.
  
bd = DOB
  dd = DOD

  if str(bd) == '' or bd == DBNull.Value or str(dd) == '' or dd == DBNull.Value:
   return 0

  birthdate = ''
  deathdate = ''

if type(bd).__name__ == 'str':
     birthdate = DateTime.ParseExact(bd,"dd/MM/yyyy", CultureInfo.InvariantCulture)
  else: # else it's supposed to be DateTime
     birthdate = bd

if type(dd).__name__ == 'str':
     deathdate = DateTime.ParseExact(dd,"dd/MM/yyyy HH:mm:ss.fff", CultureInfo.InvariantCulture)
  else: # else it's supposed to be DateTime
     deathdate = dd

if deathdate &lt; birthdate:
   return 0

# Calculate the age based on the date of birth and date of death.
  return Convert.ToInt32(Math.Truncate((deathdate - birthdate).TotalDays / 365))   
    </Expression>
    <AllowNulls>false</AllowNulls>
    </Options>
  </Generator>
</Generators>

DEATH DATE

The generator generates dates of death based on the dates of birth with the possibility to set lifespan that depending on the weight list.

<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Generator>
    <Category>Personal</Category>
    <Name>Death Date</Name>
    <ShortName>Death Date</ShortName>
    <Description>fill with dates of death with different probability</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>date</Type>
      <Type>datetime</Type>
      <Type>datetime2</Type>
      <Type>smalldatetime</Type>
    </ColumnTypes>
    <Matches>
      <Match table=".*" column="^Death.*Date$" priority="90" />
      <Match table=".*" column="^Date.*Death$" priority="90" />
      <Match table=".*" column="^Day.*Death$" priority="90" />
      <Match table=".*" column=".*dead" priority="10" />
      <Match table=".*" column="^DOD$" priority="20" />
      <Match table=".*" column="^ddate$" priority="30" />
      <Match table=".*" column="^dday$" priority="10" />      
    </Matches>
    <Options>
      <Expression>#This generator generates dates of death basing on the dates of birth with the option to set lifespan that is based on the weight list.
#It has been made by request of our user from forum http://forums.devart.com/viewtopic.php?f=23&amp;t=33396   

import random
from System import *
from System import DateTime
from System import DBNull
from System.Globalization import CultureInfo

def main(): 

# DOB - name of the column in the current table, that contains dates of birth. 
# This name should be replaced with the actual name of the column of your table, for example DAYOFBIRTHDAY.
    
  bd = DOB
  if str(bd) == '':
   return DBNull.Value
  
# "dd/MM/yyyy" - is the format of date of birth. If it differs, make the following modifications.
  bdFormat = "dd/MM/yyyy" 
 
  x = random.randint(1, 100)

  deltaDays  = 0 

# Setting of percentage ranges for ages. The sum of all ranges should be equal to 100%.
  
  if x &lt;= 5: # 5 %
      deltaDays = random.randint(1, 3650) # 5 % -  death at age 1..3650 days (1-10 years) 
  elif x &gt; 5 and x &lt;= 20: # 15 %
      deltaDays =  random.randint(3650, 18250) # 15 % - death at age 3650..18250 days (10-50 years) 
  else: # 80 %
      deltaDays =  random.randint(18250, 36500) # 80 % - death at age 18250..36500 days (50-100 years) 

  birthdate = ''

# Check of column type for the date of birth column (if it is a string column, the conversion to the DATE format should be performed).
  if type(bd).__name__ == 'str':
     birthdate = DateTime.ParseExact(bd,bdFormat, CultureInfo.InvariantCulture)
  else: # else it's supposed to be DateTime
     birthdate = bd

# Addition of lifespan (in days).
  deathDate = birthdate.AddDays(deltaDays);

# If the date of death is bigger than the current date, the random quantity of days should be subtracted from the current date.
  
  if deathDate &gt;= DateTime.Now:
   deathDate = DateTime.Now.AddDays(-random.randint(1, 300));

  if deathDate &lt; birthdate:
    deathDate = birthdate

  return deathDate.Date   
    </Expression>
    <AllowNulls>false</AllowNulls>
    </Options>
  </Generator>
</Generators>

DATA UNIVERSAL NUMBERING SYSTEM NUMBER (DUNS Number)

The script generates DUNS numbers using the Luhn algorithm.

<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Generator>
    <Category>Business</Category>
    <Name>Data Universal Numbering System Number (DUNS Number)</Name>
    <ShortName>DUNS Number</ShortName>
    <Description>481795235, 796124832, 213794852, 873491252, ...</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>varchar</Type>
      <Type>char</Type>
      <Type>nvarchar</Type>
      <Type>nchar</Type>
      <Type>bigint</Type>
      <Type>int</Type>
    </ColumnTypes>
    <Matches>
      <Match table=".*" column="^DUNS$" priority="100" minlen="9" maxlen="10" />
      <Match table=".*" column="DUNS.*code" priority="60" minlen="9" maxlen="10" />
    </Matches>
    <Options>
      <Expression># This script generates DUNS numbers using the Luhn algorithm

import random

random.seed(config["seed"])

# Calculate checksum by mod 10
def luhn_checksum(input_number):
    def digits_of(n):
        return [int(d) for d in str(n)]
    digits = digits_of(input_number)
    odd_digits = digits[-1::-2]
    even_digits = digits[-2::-2]
    checksum = 0
    checksum += sum(odd_digits)
    for d in even_digits:
        checksum += sum(digits_of(d*2))
    return checksum % 10

def calculate_luhn(partial_input_number):
    check_digit = luhn_checksum(int(partial_input_number) * 10)
    return check_digit if check_digit == 0 else 10 - check_digit

def GetDigit():
    value = random.sample(range(1, 10), 8)
    return ''.join(str(x) for x in value)

def main():
  i = 0
  while i &lt;= config["n_rows"]:
    i = i + 1
    d = GetDigit()
    yield str(d) + str(calculate_luhn(d))

main()</Expression>
    </Options>
  </Generator>

  <Generator>
    <Category>Business</Category>
    <Name>Data Universal Numbering System Number (DUNS Number)</Name>
    <ShortName>DUNS Number</ShortName>
    <Description>48-179-5235, 79-612-4832, 21-379-4852, 87-349-1252, ...</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>varchar</Type>
      <Type>char</Type>
      <Type>nvarchar</Type>
      <Type>nchar</Type>
      <Type>text</Type>
      <Type>ntext</Type>
      <Type>varchar(max)</Type>
      <Type>nvarchar(max)</Type>
    </ColumnTypes>
    <Matches>
      <Match table=".*" column="^DUNS$" priority="100" minlen="11" />
      <Match table=".*" column="DUNS.*code" priority="60" minlen="11" />
    </Matches>
    <Options>
      <Expression># This script generates DUNS numbers using the Luhn algorithm

import random

random.seed(config["seed"])

def luhn_checksum(input_number):
    def digits_of(n):
        return [int(d) for d in str(n)]
    digits = digits_of(input_number)
    odd_digits = digits[-1::-2]
    even_digits = digits[-2::-2]
    checksum = 0
    checksum += sum(odd_digits)
    for d in even_digits:
        checksum += sum(digits_of(d*2))
    return checksum % 10

def calculate_luhn(partial_input_number):
    check_digit = luhn_checksum(int(partial_input_number) * 10)
    return check_digit if check_digit == 0 else 10 - check_digit

def GetDigit():
    value = random.sample(range(1, 10), 8)
    return ''.join(str(x) for x in value)

def formatOutPut(value):
  s = '{0}{1}-{2}{3}{4}-{5}{6}{7}{8}'    
  return s.format(*list(value))

def main():
  i = 0
  while i &lt;= config["n_rows"]:
    i = i + 1
    d = GetDigit()
    yield formatOutPut(str(d) + str(calculate_luhn(d)))

main()</Expression>
    </Options>
  </Generator>
</Generators>

URL (Gravatar image)

The script retrieves a Gravatar image url based on an email address.

<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Generator>
    <Category>Personal</Category>
    <Name>URL (Gravatar image)</Name>
    <ShortName>URL (Gravatar img)</ShortName>
    <Description>http://www.gravatar.com/avatar/6ffc45896e63ba040e769b8b9d150107?s=40&amp;d=http%3A%2F%2Fwww.example.com%2Fdefault.jpg ;...</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>varchar</Type>
      <Type>char</Type>
      <Type>nvarchar</Type>
      <Type>nchar</Type>
      <Type>text</Type>
      <Type>ntext</Type>
      <Type>varchar(max)</Type>
      <Type>nvarchar(max)</Type>
    </ColumnTypes>
    <Matches>
      <Match table=".*" column="^gravatar$" priority="100" minlen="150" />
    </Matches>
    <Options>
      <Expression># This script retrieves a Gravatar image url based on an email address
import urllib, hashlib
 
# Set your variables here
email = "[email protected]"
# email = email_column_name
default = "http://www.example.com/default.jpg"
size = 40
 
# construct the url
gravatar_url = "http://www.gravatar.com/avatar/" + hashlib.md5(email.lower()).hexdigest() + "?"
gravatar_url += urllib.urlencode({'d':default, 's':str(size)})

gravatar_url</Expression>
    </Options>
  </Generator>
</Generators>

INTEGRATED CIRCUIT CARD ID (ICCID)

The script generates primary account numbers that have 19 digits long.

<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Generator>
    <Category>Personal</Category>
    <Name>Integrated Circuit Card ID (ICCID)</Name>
    <ShortName>ICCID</ShortName>
    <Description>8938605388545996967, 8937103007991618431, 8935602817534397104, ...</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>varchar</Type>
      <Type>char</Type>
      <Type>nvarchar</Type>
      <Type>nchar</Type>
      <Type>bigint</Type>
      <Type>decimal</Type>
    </ColumnTypes>
    <Matches>
      <Match table=".*" column="^ICCID$" priority="105" minlen="19" />   
    </Matches>
    <Options>
      <Expression>import random
import csv

random.seed(config["seed"])

# Calculates luhn  checksum
def luhn_checksum(input_number):
    def digits_of(n):
        return [int(d) for d in str(n)]
    digits = digits_of(input_number)
    odd_digits = digits[-1::-2]
    even_digits = digits[-2::-2]
    checksum = 0
    checksum += sum(odd_digits)
    for d in even_digits:
        checksum += sum(digits_of(d*2))
    return checksum % 10

def GetDigit(codes):
    
# Major industry identifier 
    MII = '89' 
 
# Country code
    CC = codes[random.randint(1, len(codes)-1)]
 
# Issuer identifier
    randoms=[]
    for i in range (4- len(CC)):
       randoms.append(0)
    II = ''.join(str(x) for x in randoms) + str(random.randint(1,5))
 
    value = MII + CC + II
 
# Individual account identification
    randoms=[]
    for i in range (18- len(value)):
       randoms.append(random.randint(0,9))
    IAI = ''.join(str(x) for x in randoms)
 
    value =  value + IAI      
    return  value 

# Getting list of telephone country codes from file CountryCodes.csv
def countryCodes(): 
  var_columnName = 'country_dial'
  var_fileName = config["config_path"] +'\\'+  r"CountryCodes.csv"  
  with open(var_fileName,"rb") as csvfile:
    var_reader = csv.DictReader(csvfile, delimiter=';', quotechar='"') 
    for row in var_reader:
      yield str(row[var_columnName]).upper()

def main():
  random.seed(config["seed"])
  codes = list(countryCodes())
  i = 0
  while i &lt;= config["n_rows"]:
    i = i + 1
    d = GetDigit(codes)
    yield d + str(luhn_checksum(d))

main()  
    </Expression>
    <AllowNulls>false</AllowNulls>
    </Options>
  </Generator>
</Generators>

LANGUAGE CODE (ISO 639-1 alpha-2, UpperCase)

The script reads data from a csv file, converts it to uppercase, and shuffles data in the list.

<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Generator>
    <Category>Product</Category>
    <Name>Language Code (ISO 639-1 alpha-2, UpperCase)</Name>
    <ShortName>Language Code 2 (u)</ShortName>
    <Description>ES, FR, ZH, UK, TH, ZU, EN, JA, NL, ...</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>varchar</Type>
      <Type>char</Type>
      <Type>nvarchar</Type>
      <Type>nchar</Type>
      <Type>text</Type>
      <Type>ntext</Type>
      <Type>varchar(max)</Type>
      <Type>nvarchar(max)</Type>
    </ColumnTypes>
    <Matches>
      <Match table="Language" column="Lang.*id$" priority="21" minlen="2" />
    </Matches>
    <Options>
      <Expression># This script reads data from a csv file, converts it to uppercase, and shuffles data in the list

import csv
import random

random.seed(config["seed"])
var_fileName = config["config_path"] +'\\'+  r"LanguageCodes.csv" 
var_columnName = 'ISO639-1-Alpha-2'

def values(): 
  with open(var_fileName,"rb") as csvfile:
    var_reader = csv.DictReader(csvfile, delimiter=';', quotechar='"') 
    for row in var_reader:
      yield str(row[var_columnName]).upper()

var_lst = list(values())
random.shuffle(var_lst)
var_lst</Expression>
    </Options>
  </Generator>
</Generators>

Language Code (ISO 639-1 alpha-3, UpperCase)

The script reads data from a csv file, converts it to uppercase, and shuffles data in the list.

<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <Generator>
    <Category>Product</Category>
    <Name>Language Code (ISO 639-1 alpha-3, UpperCase)</Name>
    <ShortName>Language Code 3 (u)</ShortName>
    <Description>SPA, FRE, CHI, UKR, THA, ZUL, ENG, JPN, DUT, ...</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>varchar</Type>
      <Type>char</Type>
      <Type>nvarchar</Type>
      <Type>nchar</Type>
      <Type>text</Type>
      <Type>ntext</Type>
      <Type>varchar(max)</Type>
      <Type>nvarchar(max)</Type>
    </ColumnTypes>
    <Matches>
      <Match table="Language" column="Lang.*id$" priority="31" minlen="3" />
    </Matches>
    <Options>
      <Expression># This script reads data from a csv file, converts it to uppercase, and shuffles data in the list

import csv
import random

random.seed(config["seed"])
var_fileName = config["config_path"] +'\\'+  r"LanguageCodes.csv" 
var_columnName = 'ISO639-1-Alpha-3b'

def values(): 
  with open(var_fileName,"rb") as csvfile:
    var_reader = csv.DictReader(csvfile, delimiter=';', quotechar='"') 
    for row in var_reader:
      yield str(row[var_columnName]).upper()

var_lst = list(values())
random.shuffle(var_lst)
var_lst</Expression>
    </Options>
  </Generator>
</Generators>

LOREM PIXEL

The script dynamically generates placeholder images.

<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Generator>
    <Category>IT</Category>
    <Name>Lorem Pixel</Name>
    <ShortName>Lorem Pixel</ShortName>
    <Description>dynamically generates placeholder images (!time-consuming!)</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>varbinary(max)</Type>
      <Type>image</Type>
    </ColumnTypes>
    <Matches>
    </Matches>
    <Options>
      <Expression>import clr
clr.AddReference("System.Net")
from System.Net import WebClient

import clr
clr.AddReference("System.IO")
from System.IO import MemoryStream, Stream

import clr
clr.AddReference("System.Drawing")
from System.Drawing import Image

clr.AddReference("System")
from System import Random
random = Random()

categories = ['abstract', 'animals', 'business', 'cats', 'city', 'food', 'nightlife' , 'fashion', 'people', 'nature', 'sports', 'technics', 'transport']

def getUrl():
  varCategoryIndex = random.Next(len(categories))
  varCategoryName =categories[varCategoryIndex]
  varUrlTemplate = 'http://lorempixel.com/100/100/{0}/{1}'    
  varUrl = varUrlTemplate.format(varCategoryName,varCategoryName) 
  return varUrl

def main():
  i = 0
  while i &lt;= config["n_rows"]:
    webClient = WebClient()
    dataImage = webClient.DownloadData(getUrl())
    stream = MemoryStream(dataImage)    
    webClient.Dispose()
    yield Image.FromStream(stream)
    stream.Dispose()

main()</Expression>
    </Options>
  </Generator>
</Generators>

MD5 HASH

The script generates MD5 hash based on values in some string or a table column.

<?xml version="1.0" encoding="utf-8"?>
<Generators xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <Generator>
    <Category>IT</Category>
    <Name>MD5 Hash</Name>
    <ShortName>MD5 Hash</ShortName>
    <Description>605669cab962bf944d99ce89cf9e58d9, ee3c6e8ed9c27c45f161ea416c997df8, 8625e1de7be14c39b1d14dc03d822497, ...</Description>
    <Type>SqlPythonGenerator</Type>
    <ColumnTypes>
      <Type>varchar</Type>
      <Type>char</Type>
      <Type>nvarchar</Type>
      <Type>nchar</Type>
      <Type>text</Type>
      <Type>ntext</Type>
      <Type>varchar(max)</Type>
      <Type>nvarchar(max)</Type>
    </ColumnTypes>
    <Matches>
    </Matches>
    <Options>
      <Expression># This script generates MD5 hash based on values in some string or a table column 
import hashlib

def HashIt():
  # You can also use 'sha1', 'sha256', 'sha384', 'sha512'
  m = hashlib.md5() 
  # You can use a column_name instead of 'Some Value'
  m.update('Some Value')
  return m.hexdigest()
  
HashIt()</Expression>
    </Options>
  </Generator>
</Generators>

For detailed instructions on how to generate related data elements, refer to our blog post.