> Calculates the value of `(P(tag = 1) - P(tag = 0))(log(P(tag = 1)) - log(P(tag = 0)))` for each category.

# categoricalInformationValue

<h2 id="categoricalInformationValue">
  categoricalInformationValue
</h2>

Introduced in: v20.1.0

Calculates the information value (IV) for categorical features in relation to a binary target variable.

For each category, the function computes: `(P(tag = 1) - P(tag = 0)) × (log(P(tag = 1)) - log(P(tag = 0)))`

where:

* P(tag = 1) is the proportion of rows with `tag = 1` that fall into the given category
* P(tag = 0) is the proportion of rows with `tag = 0` that fall into the given category

Information Value is a statistic used to measure the strength of a categorical feature's relationship with a binary target variable in predictive modeling.
Each value is non-negative, because the two factors in the formula always share the same sign; higher values indicate stronger predictive power.

The result indicates how much each discrete (categorical) feature `[category1, category2, ...]` contributes to a learning model which predicts the value of `tag`.
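
To make the formula concrete, the following sketch recomputes it by hand on a small inline dataset and places the result next to the built-in function. It assumes that P(tag = 1) is the share of `tag = 1` rows that fall into the category (and likewise for P(tag = 0)), which is the usual information value convention. The column names `in_category` and `tag` and the sample rows are hypothetical.

```sql theme={null}
-- Toy data: `in_category` and `tag` are hypothetical column names.
WITH data AS
(
    SELECT *
    FROM values(
        'in_category UInt8, tag UInt8',
        (1, 1), (1, 1), (1, 0),
        (0, 1), (0, 0), (0, 0), (0, 0)
    )
)
SELECT
    categoricalInformationValue(in_category, tag) AS builtin,
    -- Assumption: share of tag = 1 rows that fall into the category
    countIf(in_category = 1 AND tag = 1) / countIf(tag = 1) AS p1,
    -- Assumption: share of tag = 0 rows that fall into the category
    countIf(in_category = 1 AND tag = 0) / countIf(tag = 0) AS p0,
    -- log() is the natural logarithm in ClickHouse
    (p1 - p0) * (log(p1) - log(p0)) AS manual
FROM data;
```

If the assumption holds, the single element of `builtin` should match `manual`. A mismatch would suggest the probabilities are defined differently from what this sketch assumes.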

**Syntax**

```sql theme={null}
categoricalInformationValue(category1[, category2, ...], tag)
```

**Arguments**

* `category1, category2, ...` — One or more categorical features to analyze. Each category should contain discrete values. [`UInt8`](/reference/data-types/int-uint)
* `tag` — Binary target variable for prediction. Should contain values 0 and 1. [`UInt8`](/reference/data-types/int-uint)
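
Because every argument is a `UInt8`, a column with more than two distinct values is typically split into separate 0/1 indicator expressions, one per category of interest. The sketch below does this for age bands on the `metrica.hits` dataset used in the examples. The band boundaries are arbitrary and chosen only for illustration, and the sketch assumes each category argument is read as a 0/1 flag.

```sql theme={null}
SELECT categoricalInformationValue(
    toUInt8(Age < 18),                -- indicator: under 18
    toUInt8(Age >= 18 AND Age < 35),  -- indicator: 18 to 34
    toUInt8(Age >= 35),               -- indicator: 35 and over
    toUInt8(IsMobile)                 -- tag: 0=desktop, 1=mobile
) AS iv_by_age_band
FROM metrica.hits;
```

The tag is always the last argument; everything before it is treated as a category.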

**Returned value**

Returns an array with one information value per category argument, in the order the arguments are passed. Each value indicates the predictive strength of that category for the target variable. [`Array(Float64)`](/reference/data-types/array)
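
Since the array carries no labels, it can help to pair each value with a name. The sketch below uses `arrayZip` for this and assumes the result holds one element per category argument, in argument order. The labels `'female'` and `'under_25'` are hypothetical; the columns match the multi-feature query in the examples.

```sql theme={null}
SELECT arrayZip(
    ['female', 'under_25'],      -- hypothetical labels, one per category argument
    categoricalInformationValue(
        Sex,                     -- 0=male, 1=female
        toUInt8(Age < 25),       -- 0=25+, 1=under 25
        toUInt8(IsMobile)        -- tag: 0=desktop, 1=mobile
    )
) AS iv_by_feature
FROM metrica.hits
WHERE Sex IN (0, 1);
```

`arrayZip` requires both arrays to have the same length, so the query fails rather than mislabeling values if the number of labels and category arguments ever diverge.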

**Examples**

**Basic usage analyzing age groups vs mobile usage**

```sql title=Query theme={null}
-- Using the metrica.hits dataset (available on https://sql.clickhouse.com/) to analyze age-mobile relationship
SELECT categoricalInformationValue(Age < 15, IsMobile)
FROM metrica.hits;
```

```response title=Response theme={null}
[0.0014814694805292418]
```

**Multiple categorical features with user demographics**

```sql title=Query theme={null}
SELECT categoricalInformationValue(
    Sex,                 -- 0=male, 1=female
    toUInt8(Age < 25),   -- 0=25+, 1=under 25
    toUInt8(IsMobile)    -- 0=desktop, 1=mobile
) AS iv_values
FROM metrica.hits
WHERE Sex IN (0, 1);
```

```response title=Response theme={null}
[0.00018965785460692887,0.004973668839403392]
```
