NoiseDate
Randomly add or subtract a duration within the provided ratio interval to the original date value.
Parameters¶
| Name | Description | Default | Required | Supported DB types |
|---|---|---|---|---|
| column | The name of the column to be affected | Yes | date, timestamp, timestamptz | |
| min_ratio | The minimum random value for noise. The value must be in PostgreSQL interval format, e. g. 1 year 2 mons 3 day 04:05:06.07 |
5% from max_ration parameter | No | - |
| max_ratio | The maximum random value for noise. The value must be in PostgreSQL interval format, e. g. 1 year 2 mons 3 day 04:05:06.07 |
Yes | - | |
| min | Min threshold date (and/or time) of value. The value has the same format as column parameter |
No | - | |
| max | Max threshold date (and/or time) of value. The value has the same format as column parameter |
No | - | |
| truncate | Truncate the date to the specified part (nanosecond, microsecond, millisecond, second, minute, hour, day, month, year). The truncate operation is not applied by default. |
No | - | |
| engine | The engine used for generating the values [random, hash]. Use hash for deterministic generation |
random |
No | - |
Dynamic parameters¶
| Parameter | Supported types |
|---|---|
| min | date, timestamp, timestamptz |
| max | date, timestamp, timestamptz |
Description¶
The NoiseDate transformer randomly generates duration between min_ratio and max_ratio parameter and adds it to or
subtracts it from the original date value. The min_ratio or max_ratio parameters must be written in the
PostgreSQL interval format.
You can also truncate the resulted date up to a specified part by setting the truncate parameter.
In case you have constraints on the date range, you can set the min and max parameters to specify the threshold
values. The values for min and max must have the same format as the column parameter. Parameters min and max
support dynamic mode.
Info
If the noised value exceeds the max threshold, the transformer will set the value to max. If the noised value
is lower than the min threshold, the transformer will set the value to min.
The engine parameter allows you to choose between random and hash engines for generating values. Read more about the
engines in the Transformation engines section.
Example: Adding noise to the modified date¶
In the following example, the original timestamp value of modifieddate will be noised up
to 1 year 2 months 3 days 4 hours 5 minutes 6 seconds and 7 milliseconds with truncation up to the month part.
- schema: "humanresources"
name: "jobcandidate"
transformers:
- name: "NoiseDate"
params:
column: "hiredate"
max_ratio: "1 year 2 mons 3 day 04:05:06.07"
truncate: "month"
max: "2020-01-01 00:00:00"
Example: Adding noise to the modified date with dynamic min parameter with hash engine¶
In the following example, the original timestamp value of hiredate will be noised up
to 1 year 2 months 3 days 4 hours 5 minutes 6 seconds and 7 milliseconds with truncation up to the month part.
The max threshold is set to 2020-01-01 00:00:00, and the min threshold is set to the birthdate column. If the
birthdate column is NULL, the default value 1990-01-01 will be used. The hash engine is used for deterministic
generation - the same input will always produce the same output.
- schema: "humanresources"
name: "employee"
transformers:
- name: "NoiseDate"
params:
column: "hiredate"
max_ratio: "1 year 2 mons 3 day 04:05:06.07"
truncate: "month"
max: "2020-01-01 00:00:00"
engine: "hash"
dynamic_params:
min:
column: "birthdate"
default: "1990-01-01"
Result
| Column | OriginalValue | TransformedValue |
|---|---|---|
| hiredate | 2009-01-14 | 2010-08-01 |