create_ruleset
- GlueDataBrew.Client.create_ruleset(**kwargs)
- Creates a new ruleset that can be used in a profile job to validate the data quality of a dataset.
- See also: AWS API Documentation
- Request Syntax

    response = client.create_ruleset(
        Name='string',
        Description='string',
        TargetArn='string',
        Rules=[
            {
                'Name': 'string',
                'Disabled': True|False,
                'CheckExpression': 'string',
                'SubstitutionMap': {
                    'string': 'string'
                },
                'Threshold': {
                    'Value': 123.0,
                    'Type': 'GREATER_THAN_OR_EQUAL'|'LESS_THAN_OR_EQUAL'|'GREATER_THAN'|'LESS_THAN',
                    'Unit': 'COUNT'|'PERCENTAGE'
                },
                'ColumnSelectors': [
                    {
                        'Regex': 'string',
                        'Name': 'string'
                    },
                ]
            },
        ],
        Tags={
            'string': 'string'
        }
    )

- Parameters:
- Name (string) – [REQUIRED] The name of the ruleset to be created. Valid characters are alphanumeric (A-Z, a-z, 0-9), hyphen (-), period (.), and space.
- Description (string) – The description of the ruleset.
- TargetArn (string) – [REQUIRED] The Amazon Resource Name (ARN) of a resource (dataset) that the ruleset is associated with.
- Rules (list) – [REQUIRED] A list of rules that are defined with the ruleset. A rule includes one or more checks to be validated on a DataBrew dataset.
  - (dict) – Represents a single data quality requirement that should be validated in the scope of this dataset.
    - Name (string) – [REQUIRED] The name of the rule.
    - Disabled (boolean) – A value that specifies whether the rule is disabled. Once a rule is disabled, a profile job will not validate it during a job run. Default value is false.
    - CheckExpression (string) – [REQUIRED] The expression, which includes column references and condition names followed by variable references, possibly grouped and combined with other conditions. For example, (:col1 starts_with :prefix1 or :col1 starts_with :prefix2) and (:col1 ends_with :suffix1 or :col1 ends_with :suffix2). Column and value references are substitution variables that should start with the ‘:’ symbol. Depending on the context, the value of a substitution variable can be either an actual value or a column name. These values are defined in the SubstitutionMap. If a CheckExpression starts with a column reference, then ColumnSelectors in the rule should be null. If ColumnSelectors has been defined, then there should be no column reference on the left side of a condition, for example, is_between :val1 and :val2. For more information, see Available checks.
    - SubstitutionMap (dict) – The map of substitution variable names to their values used in a check expression. Variable names should start with a ‘:’ (colon). Variable values can either be actual values or column names. To differentiate between the two, column names should be enclosed in backticks, for example, ":col1": "`Column A`".
      - (string) –
        - (string) –
    - Threshold (dict) – The threshold used with a non-aggregate check expression. Non-aggregate check expressions will be applied to each row in a specific column, and the threshold will be used to determine whether the validation succeeds.
      - Value (float) – [REQUIRED] The value of a threshold.
      - Type (string) – The type of a threshold. Used for comparison of an actual count of rows that satisfy the rule to the threshold value.
      - Unit (string) – Unit of threshold value. Can be either a COUNT or PERCENTAGE of the full sample size used for validation.
    - ColumnSelectors (list) – List of column selectors. Selectors can be used to select columns using a name or regular expression from the dataset. The rule will be applied to the selected columns.
      - (dict) – Selector of a column from a dataset for profile job configuration. One selector includes either a column name or a regular expression.
        - Regex (string) – A regular expression for selecting a column from a dataset.
        - Name (string) – The name of a column from a dataset.
 
 
 
 
- Tags (dict) – Metadata tags to apply to the ruleset.
  - (string) –
    - (string) –
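The parameters above can be assembled into a request along the following lines. This is a minimal sketch, not an official example: the helper names `build_range_rule` and `build_ruleset_request`, the ruleset name, the ARN, and the `price` column are all hypothetical placeholders, and writing numeric substitution values as plain strings is an assumption. The rule uses ColumnSelectors, so its check expression has no column reference on the left side.

```python
def build_range_rule(name, columns, low, high, min_percentage):
    """Return one Rules entry checking that values lie between low and high.

    ColumnSelectors is set, so the check expression carries no column
    reference on its left side.
    """
    return {
        "Name": name,
        "Disabled": False,
        "CheckExpression": "is_between :val1 and :val2",
        # Variable names start with ':'. These are actual values, not column
        # names, so they are not wrapped in backticks (string form assumed).
        "SubstitutionMap": {":val1": str(low), ":val2": str(high)},
        "Threshold": {
            "Value": float(min_percentage),
            "Type": "GREATER_THAN_OR_EQUAL",
            "Unit": "PERCENTAGE",
        },
        "ColumnSelectors": [{"Name": column} for column in columns],
    }


def build_ruleset_request(name, target_arn, rules, description=None, tags=None):
    """Return keyword arguments for create_ruleset, omitting unset optionals."""
    request = {"Name": name, "TargetArn": target_arn, "Rules": list(rules)}
    if description is not None:
        request["Description"] = description
    if tags:
        request["Tags"] = dict(tags)
    return request


# Hypothetical values for illustration only.
rule = build_range_rule("price-in-range", ["price"], 0, 1000, 95)
request = build_ruleset_request(
    name="orders-quality",
    target_arn="arn:aws:databrew:us-east-1:111122223333:dataset/orders",
    rules=[rule],
    tags={"team": "data-eng"},
)
```

With a real client, for example one obtained from `boto3.client("databrew")`, the dictionary would be passed as `client.create_ruleset(**request)`.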
 
 
 
- Return type:
- dict 
- Returns:
- Response Syntax

    {
        'Name': 'string'
    }

- Response Structure
  - (dict) –
    - Name (string) – The unique name of the created ruleset.
 
 
- Exceptions
  - GlueDataBrew.Client.exceptions.ConflictException
  - GlueDataBrew.Client.exceptions.ServiceQuotaExceededException
  - GlueDataBrew.Client.exceptions.ValidationException
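One way a caller might handle the exceptions listed above is sketched below. The helper `create_ruleset_if_absent` is hypothetical, and treating ConflictException as "a ruleset with this name already exists" is an assumption about the service's behavior rather than something this page states. ValidationException and ServiceQuotaExceededException are deliberately left to propagate, since retrying the same request is unlikely to help.

```python
def create_ruleset_if_absent(client, **request):
    """Call create_ruleset and return a (name, created) tuple.

    `client` is expected to be a DataBrew client. ConflictException is
    assumed (not confirmed here) to signal that the name is already taken.
    """
    try:
        response = client.create_ruleset(**request)
    except client.exceptions.ConflictException:
        # Assumed meaning: the ruleset already exists, so report it as not created.
        return request["Name"], False
    # On success the response carries the unique name of the new ruleset.
    return response["Name"], True
```

The function takes the client as an argument so that it can be exercised with a stub in tests, without credentials or network access.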