From One Event to Thousands: Supercharging Splunk Testing with Eventgen
One of the common obstacles that a Splunk developer may encounter is the lack of real, live data to breathe life into new alerts and dashboards — to double-check their functionality and preview results before moving to production. This is where Eventgen comes in.
With Splunk Eventgen, we can generate real-time data from a single event. Sounds too good to be true? Wondering if it’s going to be a tough configuration? Well, fear not — let’s take a single event and turn it into the origin story for real-time data.
Step-by-Step: Setting Up Splunk Eventgen for Real-Time Data Simulation in Splunk
In the steps below, we will see how we can convert one of the complex logs from Palo Alto into real-time data with 24 replacement tokens and supporting files for these tokens.
Step 1: Download Eventgen from Splunkbase
Step 2: Create a directory named samples at “$SPLUNK_HOME/etc/apps/SA-Eventgen/” and place the sample log files inside this directory.
Example sample log: (paloaltosample.log)
<3163>Apr 07 09:16:54 gateway5008-lgi.corp LEEF:1.0|Palo Alto Networks|PAN-OS Syslog
Integration|8.1.6|trojan/ZIP.worm.zxtt(334838427)|ReceiveTime=2025/04/07 09:16:54
16:43:53|SerialNumber=470699273690|cat=THREAT|Subtype=virus|devTime=Apr 07 2025 09:16:54
GMT|src=124.2.54.222|dst=207.193.173.237|srcPostNAT=7.182.18.27|dstPostNAT=227.128.72.85|RuleName=
Cyber_Watcher_4045|usrName=Vortex_Gladiator_7553|SourceUser=Vortex_Gladiator_7553|DestinationUser=|
Application=web-browsing|VirtualSystem=vsys1|SourceZone=INSIDE-ZN|DestinationZone=OUTSIDE-
ZN|IngressInterface=ethernet1/1|EgressInterface=ethernet1/3|LogForwardingProfile=testForwarder|SessionI
D=8576|RepeatCount=1|srcPort=5731|dstPort=4909|srcPostNATPort=3510|dstPostNATPort=2862|Flags=0x4
06000|proto=tcp|action=alert|Miscellaneous=\"https://security.example.test/data/du/20112024_UG_FAQ.xml
\"|ThreatID=trojan/ZIP.worm.zxtt(334838427)|URLCategory=faculty-examination-
workshop|sev=2|Severity=Medium|Direction=server-to-
client|sequence=685199276|ActionFlags=0xa000000000000000|SourceLocation=10.0.0.0-
10.255.255.255|DestinationLocation=testPlace|ContentType=|PCAP_ID=0|FileDigest=|Cloud=|URLIndex=5|Re
questMethod=|Subject=|DeviceGroupHierarchyL1=12|DeviceGroupHierarchyL2=0|DeviceGroupHierarchyL3=
0|DeviceGroupHierarchyL4=0|vSrcName=|DeviceName=testName|SrcUUID=|DstUUID=|TunnelID=0|Monitor
Tag=|ParentSessionID=0|ParentStartTime=|TunnelType=N/A|ThreatCategory=xml|ContentVer=Antivirus-
2969-3479
Step 3: Create an eventgen.conf file at “$SPLUNK_HOME/etc/apps/SA-Eventgen/local/”
This .conf file will include all the settings required to generate the real-time sample events.
Key Configuration Options in eventgen.conf
- [stanzaname] — The stanza name must match the exact filename of the sample log inside the samples folder.
- index — The name of the index where the generated events will be stored.
- host — The value of the host field for the generated events.
Token Replacement Settings
- token.<n>.token — A regular expression to capture the part of the event that needs to be replaced.
- token.<n>.replacementType — Specifies the type of replacement. Key types include:
- timestamp — For replacing time fields.
- random — Generates random values like integer, IP address and guid.
- file — Values are replaced from a specified file.
- mvfile — Allows replacement of multiple values from the same row of a file.
- token.<n>.replacement — The value to use as the replacement, depending on the replacementType:
- For the random replacementType: ipv4, ipv6, guid, mac, integer[<start>:<end>], or float[<start>:<end>].
- For the file replacementType: the name of the file, placed in the samples folder.
- For the mvfile replacementType: the name of the file and the column number, separated by a colon (for example, severity.csv:1), with the file placed in the samples folder.
- For the timestamp replacementType: a strptime-style format string.
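To make the token settings concrete, here is a minimal Python sketch of the general idea: a regular expression locates a value in the event, and only the captured group is swapped for a generated value. This is an illustration under our own assumptions, not Eventgen's actual implementation; the helper names replace_token and random_ipv4 are hypothetical, and the event is a shortened excerpt of the sample log.

```python
import random
import re
import time


def replace_token(event, pattern, make_value):
    """Replace capture group 1 of every match (or the whole match if the
    pattern has no group) with a freshly generated value."""
    regex = re.compile(pattern)
    group = 1 if regex.groups else 0
    pieces, last = [], 0
    for match in regex.finditer(event):
        pieces.append(event[last:match.start(group)])
        pieces.append(make_value())
        last = match.end(group)
    pieces.append(event[last:])
    return "".join(pieces)


def random_ipv4():
    """Hypothetical stand-in for the 'ipv4' random replacement."""
    return ".".join(str(random.randint(0, 255)) for _ in range(4))


# Shortened excerpt of the sample event, for illustration only.
event = "devTime=Apr 07 2025 09:16:54 GMT|src=124.2.54.222|dst=207.193.173.237|srcPort=5731"

# random / ipv4
event = replace_token(event, r"src\=([\d\.]+)", random_ipv4)
# random / integer[1111:9999]
event = replace_token(event, r"srcPort\=(\d+)", lambda: str(random.randint(1111, 9999)))
# timestamp / %b %d %Y %H:%M:%S
event = replace_token(
    event,
    r"devTime\=(\w+\s\d{2}\s\d{4}\s\d{2}\:\d{2}\:\d{2})",
    lambda: time.strftime("%b %d %Y %H:%M:%S"),
)
print(event)
```

Because only the capture group is replaced, the field names (src=, srcPort=, devTime=) survive untouched, and fields without a token, such as dst here, keep their original value.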
Example eventgen.conf:
[paloaltosample.log]
index = palo
host = splunk
# Row ID
token.0.token = \<(\d+)\>
token.0.replacementType = random
token.0.replacement = integer[1111:9999]
# Timestamp
token.1.token = \w+\s+\d{1,2}\s\d{2}\:\d{2}\:\d{2}
token.1.replacementType = timestamp
token.1.replacement = %b %d %H:%M:%S
# ThreatID
token.2.token = (\w{6}\/\w{3}\.\w{3}\.\w{4}\(\d{9}\))
token.2.replacementType = mvfile
token.2.replacement = /opt/splunk/etc/apps/SA-Eventgen/samples/threat_ids.txt:1
# ReceiveTime
token.3.token = ReceiveTime\=(\d{4}\/\d{2}\/\d{2}\s\d{2}\:\d{2}\:\d{2})
token.3.replacementType = timestamp
token.3.replacement = %Y/%m/%d %H:%M:%S
# SerialNumber
token.4.token = SerialNumber\=(\d+)
token.4.replacementType = file
token.4.replacement = "/opt/splunk/etc/apps/SA-Eventgen/samples/12_digit_combinations.txt"
# devTime
token.5.token = devTime\=(\w+\s\d{2}\s\d{4}\s\d{2}\:\d{2}\:\d{2})
token.5.replacementType = timestamp
token.5.replacement = %b %d %Y %H:%M:%S
# src
token.6.token = src\=([\d\.]+)
token.6.replacementType = random
token.6.replacement = ipv4
# dst
token.7.token = dst\=([\d\.]+)
token.7.replacementType = random
token.7.replacement = ipv4
# srcPostNAT
token.8.token = srcPostNAT\=([\d\.]+)
token.8.replacementType = random
token.8.replacement = ipv4
# dstPostNAT
token.9.token = dstPostNAT\=([\d\.]+)
token.9.replacementType = random
token.9.replacement = ipv4
# RuleName
token.10.token = RuleName\=([\w\-\d]+)
token.10.replacementType = file
token.10.replacement = "/opt/splunk/etc/apps/SA-Eventgen/samples/rule_names.txt"
# usrName and SourceUser
token.11.token = (?:usrName|SourceUser)\=(\w+)
token.11.replacementType = mvfile
token.11.replacement = /opt/splunk/etc/apps/SA-Eventgen/samples/user_names.txt:1
# Host
token.12.token = \d{2}\:\d{2}\:\d{2}\s([\w\-\.]+)\sLEEF
token.12.replacementType = file
token.12.replacement = "/opt/splunk/etc/apps/SA-Eventgen/samples/hostnames.txt"
# Miscellaneous
token.13.token = Miscellaneous\=\\\"([\w\/.:\-]+)
token.13.replacementType = mvfile
token.13.replacement = /opt/splunk/etc/apps/SA-Eventgen/samples/miscellaneous.csv:1
# SessionID
token.15.token = SessionID\=(\d{4})
token.15.replacementType = random
token.15.replacement = integer[1111:9999]
# srcPort
token.16.token = srcPort\=(\d+)
token.16.replacementType = random
token.16.replacement = integer[1111:9999]
# dstPort
token.17.token = dstPort\=(\d+)
token.17.replacementType = random
token.17.replacement = integer[1111:9999]
# srcPostNATPort
token.18.token = srcPostNATPort\=(\d+)
token.18.replacementType = random
token.18.replacement = integer[1111:9999]
# dstPostNATPort
token.19.token = dstPostNATPort\=(\d+)
token.19.replacementType = random
token.19.replacement = integer[1111:9999]
# sequence
token.20.token = sequence\=(\d+)
token.20.replacementType = file
token.20.replacement = "/opt/splunk/etc/apps/SA-Eventgen/samples/9_digit_combos.txt"
# URLCategory
token.21.token = URLCategory\=([\w-]+)
token.21.replacementType = file
token.21.replacement = "/opt/splunk/etc/apps/SA-Eventgen/samples/url_categories.txt"
# sev
token.22.token = sev\=(\d)
token.22.replacementType = mvfile
token.22.replacement = /opt/splunk/etc/apps/SA-Eventgen/samples/severity.csv:1
# ThreatCategory
token.23.token = ThreatCategory\=(\w+)
token.23.replacementType = mvfile
token.23.replacement = /opt/splunk/etc/apps/SA-Eventgen/samples/miscellaneous.csv:2
# Severity
token.24.token = Severity\=(\w+)
token.24.replacementType = mvfile
token.24.replacement = /opt/splunk/etc/apps/SA-Eventgen/samples/severity.csv:2
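With this many tokens, it is easy for a regular expression to drift out of sync with the sample, and a token whose pattern never matches leaves that part of the event unchanged. As a rough sanity check, a small Python script along these lines can list the patterns that do not match. It is a sketch under our own assumptions: the sample is a shortened excerpt, only a few tokens are copied in, the helper unmatched_tokens is hypothetical, and token.x is a deliberately wrong pattern added for illustration.

```python
import re

# Shortened excerpt of paloaltosample.log, for illustration only.
sample = (
    "<3163>Apr 07 09:16:54 gateway5008-lgi.corp LEEF:1.0|Palo Alto Networks|"
    "PAN-OS Syslog Integration|8.1.6|SerialNumber=470699273690|cat=THREAT|"
    "src=124.2.54.222|dst=207.193.173.237|sev=2|Severity=Medium"
)

# A few token patterns copied from eventgen.conf, plus one hypothetical
# pattern (token.x) that intentionally does not match anything.
tokens = {
    "token.0": r"\<(\d+)\>",
    "token.4": r"SerialNumber\=(\d+)",
    "token.6": r"src\=([\d\.]+)",
    "token.22": r"sev\=(\d)",
    "token.x": r"NoSuchField\=(\d+)",
}


def unmatched_tokens(event, patterns):
    """Return the names of patterns that find nothing in the event."""
    return [name for name, pattern in patterns.items() if not re.search(pattern, event)]


print(unmatched_tokens(sample, tokens))  # prints ['token.x']
```

Running a check like this against the full sample file before enabling the input can save a round of debugging in Splunk.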
Notes
- If we look closely at the above “eventgen.conf”, it becomes clear that many replacement tokens use external files for dynamic values. These files were generated using simple Python scripts, as shown below:
import random

output_file = "/opt/splunk/etc/apps/SA-Eventgen/samples/url_categories.txt"

# Define category prefixes and suffixes
prefixes = ["educational", "financial", "healthcare", "government", "technology", "retail", "automotive", "hospitality", "entertainment", "legal"]
suffixes = ["institutions", "services", "centers", "departments", "companies", "agencies", "corporations", "networks", "providers", "solutions"]

# Generate unique URL categories. Only len(prefixes) * len(suffixes) distinct
# combinations exist, so cap the target to keep the loop from running forever.
num_categories = min(5000, len(prefixes) * len(suffixes))  # Change as needed
url_categories = set()
while len(url_categories) < num_categories:
    category = f"{random.choice(prefixes)}-{random.choice(suffixes)}"
    url_categories.add(category)

# Save to file
with open(output_file, "w") as f:
    for category in url_categories:
        f.write(category + "\n")

print(f"Generated {num_categories} unique URL categories and saved to {output_file}")
- Make sure the modular input named "Eventgen" is enabled in Settings >> Data Inputs.
- One of the most interesting replacement types is mvfile. In the example eventgen.conf, it is used for values that repeat within the sample event, such as the ThreatID token, which appears both near the start of the event and in the ThreatID field. If we used file as the replacement type, each occurrence could receive a different value, whereas mvfile helps us keep them the same.
- Additionally, mvfile is useful when related values must come from the same row of a file. For instance, the sev and Severity tokens should stay consistent: a sev value of 1 should always be paired with a Severity of Low.
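The row-consistency idea behind mvfile can be sketched in a few lines of Python. This is only an illustration, not Eventgen's code: the file contents are hypothetical, we assume the columns are comma-separated (as the .csv file names suggest), and the 1/Low, 2/Medium, 3/High mapping is an example extrapolated from the sev=2, Severity=Medium pair in the sample.

```python
import csv
import io
import random

# Hypothetical contents of severity.csv: column 1 = sev, column 2 = Severity.
severity_csv = "1,Low\n2,Medium\n3,High\n"
rows = list(csv.reader(io.StringIO(severity_csv)))


def pick_row(all_rows):
    """Pick one random row; every column used for this event comes from it."""
    return random.choice(all_rows)


row = pick_row(rows)
# severity.csv:1 and severity.csv:2 in eventgen.conf are 1-indexed columns,
# which correspond to row[0] and row[1] here.
sev, severity = row[0], row[1]
event = f"sev={sev}|Severity={severity}"
print(event)
```

Because both values are read from the same randomly chosen row, a pairing such as sev=1 with Severity=High cannot occur, which is the behavior the sev and Severity tokens rely on.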
Real-Time Output Example
Once everything is configured, we will start receiving real-time data like the following:
Beyond the basics
Apart from the above settings, we can also control:
- The event generation rate
- Hostname values
- Many other fields using additional replacement tokens
For a comprehensive reference, check out the official Eventgen documentation.
Need help setting this up or want personalized support? Reach out to us — we’re here to help!