How to Easily Mask Social Security Numbers (SSNs) in Log Files or Events in 4 Ways Using Data Bots

This blog post covers 4 data masking and data obfuscation techniques that you can implement with Robotic Data Automation (RDA) to mask or hide sensitive data or personally identifiable information (PII), such as social security numbers (SSNs), that may have crept unintentionally into logs or events.

4 Approaches to Easily Mask Social Security Numbers (SSNs) with Data Bots

  1. Data Masking
  2. MD5 Hash
  3. SHA256 Hash
  4. Data Obfuscation

One solution you may think of is to change the data-producing app so that it turns off or alters its logging. In reality, this can take quite some time: the change has to go through testing and the change control process before it is pushed to production. You also may not know which other integrations will break if the data producer makes this change. Another integration may need this data.

This is a trend we are seeing more often.

Data producers don't know how, when, or what data consumers want to consume. As a result, data producers generally produce as much data as they can and let the data consumers decide what they want to consume.

In such situations, it is best to be able to selectively filter, clean, shape, or mask data as desired before you send it to analytics platforms like Splunk, Elasticsearch, or Snowflake.
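As a rough illustration of masking data before it is forwarded, the Python sketch below scrubs SSN-shaped values from raw log lines with a regular expression. This is not how RDA implements its bots; the pattern, the function name scrub_line, and the sample log line are all assumptions made for this example.

```python
import re

# Assumption: SSNs appear as 3-2-4 digits separated by hyphens.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scrub_line(line: str, replacement: str = "###-##-####") -> str:
    """Replace every SSN-shaped substring in a single log line."""
    return SSN_PATTERN.sub(replacement, line)


if __name__ == "__main__":
    # Hypothetical log line, for illustration only.
    print(scrub_line("user=alice ssn=123-45-6789 action=login"))
```

Lines without an SSN-shaped value pass through unchanged, so a scrubber like this can sit in front of any downstream consumer.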

Here we will discuss 4 approaches to masking SSNs with data bots. In each approach, you will see that we simply invoke the bot by its name and specify values for its parameters.

  • @dm – data management bots
  • columns (or from) – name of the field that holds the SSN value; customer_ssn in the examples below

1. Data Masking Bot

-->  @dm:mask     columns='customer_ssn' & pos=4 & char='#'
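To make the idea of character masking concrete, here is a minimal Python sketch. The post does not spell out what pos means for the bot, so this example assumes it is the number of leading characters to replace; the function names mask_prefix and mask_columns are hypothetical and are not part of RDA.

```python
def mask_prefix(value: str, pos: int = 4, char: str = "#") -> str:
    """Replace the first `pos` characters of `value` with `char`.

    Assumption: `pos` counts leading characters to hide. The real bot
    may interpret it differently.
    """
    hidden = max(0, min(pos, len(value)))
    return char * hidden + value[hidden:]


def mask_columns(record: dict, columns: list, pos: int = 4, char: str = "#") -> dict:
    """Return a copy of `record` with the named fields masked."""
    masked = dict(record)
    for name in columns:
        if masked.get(name) is not None:
            masked[name] = mask_prefix(str(masked[name]), pos, char)
    return masked


if __name__ == "__main__":
    event = {"customer": "alice", "customer_ssn": "123-45-6789"}
    print(mask_columns(event, ["customer_ssn"], pos=4, char="#"))
```

Under this assumption, the hyphen counts as one of the masked characters, so the first four characters of 123-45-6789 are replaced and the rest is left readable.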

2. Data Masking with MD5 Hash

-->  @dm:map from='customer_ssn' & to='customer_ssn' & func='md5'

3. Data Masking with SHA-256 Hash

-->  @dm:map from='customer_ssn' & to='customer_ssn' & func='sha256'
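The two hash-based approaches can be sketched in Python with the standard hashlib module. The helper hash_field and its parameter names are hypothetical; they only mirror the from, to, and func parameters shown in the bot invocations and say nothing about how the bot itself is built.

```python
import hashlib


def hash_field(record: dict, from_field: str, to_field: str, func: str = "sha256") -> dict:
    """Return a copy of `record` where `to_field` holds the hex digest of `from_field`.

    `func` is any algorithm name accepted by hashlib.new, e.g. "md5" or "sha256".
    """
    raw = str(record[from_field]).encode("utf-8")
    out = dict(record)
    out[to_field] = hashlib.new(func, raw).hexdigest()
    return out


if __name__ == "__main__":
    event = {"customer": "alice", "customer_ssn": "123-45-6789"}
    print(hash_field(event, "customer_ssn", "customer_ssn", func="md5"))
    print(hash_field(event, "customer_ssn", "customer_ssn", func="sha256"))
```

Because the same input always produces the same digest, hashed values can still be used to group or correlate events for one customer without exposing the SSN itself. An MD5 digest is 32 hex characters and a SHA-256 digest is 64.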

4. Data Obfuscation Bot

-->  @dm:eval customer_ssn="synthetic('ssn')"
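Obfuscation replaces the real value with a made-up one of the same shape. The Python sketch below assumes that synthetic('ssn') means "generate a random SSN-shaped string"; the post does not describe the actual generator, and the names synthetic_ssn and obfuscate_field are hypothetical.

```python
import random


def synthetic_ssn(rng: random.Random) -> str:
    """Return a random string shaped like an SSN (NNN-NN-NNNN).

    Assumption: only the format matters here; no attempt is made to
    follow real SSN issuance rules.
    """
    area = rng.randint(1, 899)
    group = rng.randint(1, 99)
    serial = rng.randint(1, 9999)
    return f"{area:03d}-{group:02d}-{serial:04d}"


def obfuscate_field(record: dict, field: str, rng: random.Random) -> dict:
    """Return a copy of `record` with `field` replaced by a synthetic SSN."""
    out = dict(record)
    out[field] = synthetic_ssn(rng)
    return out


if __name__ == "__main__":
    event = {"customer": "alice", "customer_ssn": "123-45-6789"}
    print(obfuscate_field(event, "customer_ssn", random.Random()))
```

Unlike hashing, the replacement has no relationship to the original value, so downstream tools that expect an SSN-formatted field keep working while the real number is gone.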

https://app.tella.tv/video/ckvilxvx9000308l3gewh3ddx/embed

We have more than 500 such bots available out of the box on our platform, and we are adding more every week. You can invoke a bot simply by calling its name, and you can chain bots in a pipeline to implement more complex scenarios or workflows.
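The idea of chaining steps into a pipeline can be sketched in plain Python as a list of functions applied in order. This is only an analogy for how chained bots behave, not RDA's pipeline syntax or engine; keep_with_field, mask_field, and run_pipeline are hypothetical names, and mask_field assumes pos is a non-negative count of leading characters to hide.

```python
def keep_with_field(field):
    """Build a step that drops records lacking `field`."""
    def step(records):
        return [r for r in records if field in r]
    return step


def mask_field(field, pos=4, char="#"):
    """Build a step that masks the first `pos` characters of `field`.

    Expects every incoming record to contain `field`.
    """
    def step(records):
        result = []
        for r in records:
            value = str(r[field])
            hidden = min(pos, len(value))
            result.append({**r, field: char * hidden + value[hidden:]})
        return result
    return step


def run_pipeline(records, steps):
    """Feed the output of each step into the next one."""
    for step in steps:
        records = step(records)
    return records


if __name__ == "__main__":
    events = [{"customer_ssn": "123-45-6789"}, {"message": "heartbeat"}]
    steps = [keep_with_field("customer_ssn"), mask_field("customer_ssn")]
    print(run_pipeline(events, steps))
```

Each step takes a list of records and returns a new list, which is what lets small single-purpose steps be combined into a larger workflow.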

Oh, by the way, you don't need to be a Python expert, a software engineer, or a Java programmer to write these pipelines. They look more like a template or YAML markup and are very easy to create. That is why our platform has broad appeal among IT professionals, including IT Ops, ITSM, and CloudOps personnel, DevOps/SREs, enterprise tool admins, citizen automation developers, and more.

Learn More or Get Started

Visit https://roboticdata.ai to learn more about RDA. If you are interested in kicking the tires, you can sign up for free and get started right away with our cloud-hosted SaaS offering.

Reach out to us at info@roboticdata.ai if you have feedback or questions.

Tejo Prayaga
Tejo Prayaga is a high-growth Product Management & Marketing leader. Tejo has extensive experience helping enterprises build, scale, and market innovative products and solutions that use modern technologies like Data Automation, Artificial Intelligence, Machine Learning, Microservices, Cloud Services, and more. Startup geek, Ex-Cisco, MBA, Speaker, and Toastmaster!! https://www.linkedin.com/in/tprayaga