Typing a convict Config Schema

2022-01-09T22:11:03.331Z

Pull Request:

The convict lib is used to create a schema for config options:

const schema = {
  env: {
    doc: 'Environment that the application will run in',
    format: ['production', 'development', 'test'],
    default: 'development',
    env: 'NODE_ENV',
  },
  port: {
    doc: 'HTTP port n8n can be reached',
    format: Number,
    default: 5678,
    env: 'PORT',
    arg: 'port',
  },
  db: {
    host: {
      doc: 'Database host name/IP address',
      format: '*',
      default: 'server1.dev.test',
    },
    name: {
      doc: 'Database name',
      format: String,
      default: 'users',
    },
  },
};

const config = convict(schema);
const port = config.get('port');

For every option in the schema:

  • env defines the option's environment variable, and
  • format defines the option's required type, e.g. using a constructor like String and Number, an array literal, or a lib-specific format.

Native typings for the convict schema

Options can be grouped via nesting: db is not a config option but db.host is. The lib's typings provide correct return types for top-level options such as port in the example above, but fall back to any when accessing nested options via dotted path notation:

const port = config.get('port'); // number
const dbHost = config.get('db.host'); // any

Typings for get():

interface Config<T> {
  /**
   * @returns the current value of the name property. name can use dot
   * notation to reference nested values
   */
  get<K extends keyof T | string | null | undefined = undefined>(
    name?: K
  ): K extends null | undefined ? T : K extends keyof T ? T[K] : any;
  get<K extends keyof T, K2 extends keyof T[K]>(name: string): T[K][K2];
  get<K extends keyof T, K2 extends keyof T[K], K3 extends keyof T[K][K2]>(
    name: K
  ): T[K][K2][K3];
  get<
    K extends keyof T,
    K2 extends keyof T[K],
    K3 extends keyof T[K][K2],
    K4 extends keyof T[K][K2][K3]
  >(
    name: string
  ): T[K][K2][K3][K4];
}

interface convict {
  addFormat(format: convict.Format): void;
  addFormats(formats: { [name: string]: convict.Format }): void;
  addParser(parsers: convict.Parser | convict.Parser[]): void;
  <T>(config: convict.Schema<T> | string): convict.Config<T>;
}

For a longer explanation on get() typings in convict, refer to the example in Programming TypeScript by Boris Cherny, p. 135.

Config<T> takes a generic, which is the shape of the schema provided to convict(schema). Therefore, config.get(option) returns the option or the object accessible at the option key:

const shouldSave = config.get('executions').saveDataManualExecutions; // boolean
const processType = config.get('executions').process; // string

The full schema type can be extracted with typeof schema or ReturnType<typeof config.getProperties>.

This approach is limited:

  • It does not support dotted path notation.
  • It does not prevent access to a non-existent top-level or nested option.
  • It returns any for a non-existent top-level or nested option.
  • It returns string for an option whose format is an array of string literals, e.g. format: ['sqlite', 'mariadb', 'mysqldb', 'postgresdb'], instead of returning a union of those string literals.
  • It provides autocompletion at each step of object traversal, instead of full autocompletion at root.
  • It is restricted to three levels of nesting with T[K][K2][K3][K4].
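The fallback to any can be reproduced without installing convict. The sketch below uses a simplified stand-in for the first get overload; LooseConfig and makeConfig are hypothetical names introduced only for this illustration, and the runtime lookup is an assumption, not convict's implementation.

```typescript
// Hypothetical stand-in: mirrors only the shape of the first `get` overload.
interface LooseConfig<T> {
  get<K extends keyof T | string>(name: K): K extends keyof T ? T[K] : any;
}

// Hypothetical helper: resolves a dotted path against a plain object.
function makeConfig<T extends object>(source: T): LooseConfig<T> {
  return {
    get(name: string) {
      let node: any = source;
      for (const key of name.split('.')) {
        node = node == null ? undefined : node[key];
      }
      return node;
    },
  } as LooseConfig<T>;
}

const looseConfig = makeConfig({ port: 5678, db: { host: 'server1.dev.test' } });

const loosePort = looseConfig.get('port'); // typed as number
const looseHost = looseConfig.get('db.host'); // typed as any
const looseTypo = looseConfig.get('db.hots'); // also any, and no compile-time error
```

The misspelled path compiles and only surfaces at runtime as undefined, which is the gap the rest of the article sets out to close.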

Re-typing the convict schema

Ideally, config.get(option) should be typed such that option is a union of strings...

  • where each string stands for a top-level key or a dotted path to a nested key in the schema, and
  • where each string is mapped to its correct return type, with the union-of-string-literals case covered.

const processType = config.get('executions.process'); // 'main' | 'own'
const shouldSave = config.get('executions.saveDataManualExecutions'); // boolean

This would also allow for:

  • full autocompletion at root,
  • type-checked config.get(option) calls, and
  • type inference at any depth of nesting in the schema.

To achieve this, we can break the problem down into steps:

  1. Collect all the paths in the schema that lead to options.
  2. Group the collected paths based on their return types.
  3. Map all the collected option paths to their return types.
  4. Extend convict by adding a function signature with the mapping.

1. Collecting schema paths

We can traverse the full schema object with a recursive type:

type GetPathSegments<Traversable> = Traversable extends string
  ? []
  : {
      [K in keyof Traversable]: [K, ...GetPathSegments<Traversable[K]>];
    }[keyof Traversable];

type Result = GetPathSegments<typeof schema>;

When GetPathSegments<Traversable> is first instantiated with typeof schema, it receives the full shape of that object and checks whether that shape is a string (i.e., the base case for recursion). Since an object is not a string, evaluation follows the false branch of the conditional type, into a mapped type that is immediately indexed by its own keys. This mapped type loops over every top-level key of the object, places the key at the head of a collector tuple, and recurses into the value at that key.

Recursion continues until Traversable extends string is true, i.e. the case where a key points to a string (as required by convict's schema), so there is no further nested object to step into and the result is the empty tuple []. Since this is the base case for the recursion, it resolves back up the stack to the initial call, where the spread ... unpacks each nested tuple into the collector tuple: [K, ...GetPathSegments<Traversable[K]>]

This array has collected path segments for the key that is being looped over. To get at all arrays for all the keys looped over — to get a union of these arrays — we can immediately index into the object resulting from the mapped type. We use keyof Traversable to index into the mapped type, i.e. the same expression that we used to generate the union that we looped over in the first place.

Result represents every valid path in the full schema object, with all nested objects included.

["port", "default"] | ["queue", "bull", "redis", "port", "default"] | ["queue", "bull", "redis", "db", "default"] | ["queue", "bull", "redis", "timeoutThreshold", "default"] | <etc>
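To see the collector at work on something small, the sketch below applies the same type to a hypothetical MiniSchema whose leaves are all strings, so the traversal always reaches the base case. MiniSchema and the variable names are assumptions made for this illustration.

```typescript
type GetPathSegments<Traversable> = Traversable extends string
  ? []
  : {
      [K in keyof Traversable]: [K, ...GetPathSegments<Traversable[K]>];
    }[keyof Traversable];

// Hypothetical mini schema: every leaf is a string.
type MiniSchema = {
  env: { doc: string; default: string };
  db: { host: { doc: string; default: string } };
};

// ["env", "doc"] | ["env", "default"] | ["db", "host", "doc"] | ["db", "host", "default"]
type MiniSegments = GetPathSegments<MiniSchema>;

// Each value below must be one of the collected tuples to type-check.
const toHostDefault: MiniSegments = ['db', 'host', 'default'];
const toEnvDoc: MiniSegments = ['env', 'doc'];
```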

However, there is an issue. This traversal is so exhaustive that it descends all the way into the prototype properties of every nested value. This means that the <etc> in the collected segments contains fields such as MAX_VALUE and NEGATIVE_INFINITY, which belong to NumberConstructor.

To curb this exhaustive traversal, we need to validate the keys before traversal:

type ValidKeys<T> = keyof T extends string
  ? keyof T extends keyof NumberConstructor
    ? never
    : keyof T
  : never;
type GetPathSegments<T> = T extends string
  ? []
  : {
      [K in ValidKeys<T>]: [K, ...GetPathSegments<T[K]>];
    }[ValidKeys<T>];

With this restriction, Result is the following union, without <etc> containing prototype properties:

["host", "default"] | ["queue", "bull", "prefix", "default"] | ["queue", "bull", "redis", "db", "default"] | <etc>
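It may help to look at ValidKeys on its own. Because keyof T is checked as a whole rather than member by member, the filter is all-or-nothing: a key set is rejected if any key is not a string, or if every key also exists on NumberConstructor. The sketch below checks three cases; Equals is a commonly used type-equality helper and, like the alias names, is an assumption added for this illustration.

```typescript
type ValidKeys<T> = keyof T extends string
  ? keyof T extends keyof NumberConstructor
    ? never
    : keyof T
  : never;

// Hypothetical helper: resolves to true only when A and B are the same type.
type Equals<A, B> = (<X>() => X extends A ? 1 : 2) extends <X>() => X extends B ? 1 : 2
  ? true
  : false;

type PlainKeys = ValidKeys<{ host: unknown; port: unknown }>; // "host" | "port"
type NumberCtorKeys = ValidKeys<NumberConstructor>; // never: the Number constructor is not traversed
type ArrayKeys = ValidKeys<string[]>; // never: keyof string[] includes number

// Compile-time checks: each slot only accepts `true` if the types match.
const validKeyChecks: [
  Equals<PlainKeys, 'host' | 'port'>,
  Equals<NumberCtorKeys, never>,
  Equals<ArrayKeys, never>
] = [true, true, true];

const someKey: PlainKeys = 'host';
```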

Now that we have collected a union of arrays of string literals representing all valid non-prototype paths in the config schema, we need to remove any excess segments from our paths (e.g. the default key above) and join the arrays into dotted paths.

type RemoveExcess<T> = T extends [...infer Path, 'format' | 'default']
  ? Path extends string[]
    ? Path
    : never
  : never;

type JoinByDotting<T extends string[]> = T extends [infer F]
  ? F
  : T extends [infer F, ...infer R]
  ? F extends string
    ? R extends string[]
      ? `${F}.${JoinByDotting<R>}`
      : never
    : never
  : string;

type ToDottedPath<T> = JoinByDotting<RemoveExcess<T>>;

type Result = ToDottedPath<GetPathSegments<typeof schema>>;

RemoveExcess<T> drops the trailing format or default segment and discards any path that does not end in one of them. In JoinByDotting<T>, T is the tuple of path segments, F is its first string literal, and R is the rest, i.e. a tuple of the remaining string literals. JoinByDotting<T> works by taking the first string literal, placing it at the start of a dotted template literal, and recursing into the rest of the tuple. Recursion stops at the base case where the tuple contains a single element, at which point it resolves back up the stack to combine into one dotted path per tuple.

Result is now:

"port" | "queue.bull.redis.port" | "queue.bull.redis.db" | "queue.bull.redis.timeoutThreshold" | <etc>

This is a union of all strings that stand for all the paths in the schema that lead to options.
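The two helpers can be tried in isolation on hand-written tuples. In the sketch below, SampleSegments is a hypothetical input, and toDottedPath is a hypothetical runtime counterpart added only to make the type-level behaviour observable; neither is part of the original code.

```typescript
type RemoveExcess<T> = T extends [...infer Path, 'format' | 'default']
  ? Path extends string[]
    ? Path
    : never
  : never;

type JoinByDotting<T extends string[]> = T extends [infer F]
  ? F
  : T extends [infer F, ...infer R]
  ? F extends string
    ? R extends string[]
      ? `${F}.${JoinByDotting<R>}`
      : never
    : never
  : string;

type ToDottedPath<T> = JoinByDotting<RemoveExcess<T>>;

// Hypothetical input; ['port', 'env'] is dropped because it ends in neither segment.
type SampleSegments =
  | ['port', 'default']
  | ['queue', 'bull', 'redis', 'port', 'default']
  | ['db', 'host', 'format']
  | ['port', 'env'];

type SamplePaths = ToDottedPath<SampleSegments>; // "port" | "queue.bull.redis.port" | "db.host"

const redisPort: SamplePaths = 'queue.bull.redis.port';

// Hypothetical runtime mirror of ToDottedPath, for illustration only.
function toDottedPath(segments: readonly string[]): string | undefined {
  const last = segments[segments.length - 1];
  if (last !== 'format' && last !== 'default') return undefined;
  return segments.slice(0, -1).join('.');
}
```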

2. Grouping paths based on return type

To group these paths by type, we first need to adjust the recursive collector GetPathSegments. The collector must now accept a second generic that names the type of option to collect, so that we can call it once per type and group the paths by the results.

Original with single generic Traversable:

type GetPathSegments<Traversable> = Traversable extends string
  ? []
  : {
      [K in keyof Traversable]: [K, ...GetPathSegments<Traversable[K]>];
    }[keyof Traversable];

Adjusted with two generics, Traversable and Filter:

type GetPathSegments<Traversable, Filter> = Traversable extends Filter
  ? []
  : {
      [K in ValidKeys<Traversable>]: [
        K,
        ...GetPathSegments<Traversable[K], Filter>
      ];
    }[ValidKeys<Traversable>];

Redoing the example above with this new version...

type Result = ToDottedPath<GetPathSegments<typeof schema, number>>;

yields a union of all the paths to options only for the requested type, here for number:

"port" | "queue.bull.redis.port" | "queue.bull.redis.timeoutThreshold" | "queue.bull.queueRecoveryInterval" | "database.mysqldb.port" | "database.postgresdb.port" | "binaryDataManager.persistedBinaryDataTTL" | <etc>

The call can be abstracted as a utility...

type CollectPathsByType<T> = ToDottedPath<GetPathSegments<typeof schema, T>>;

which we can use to collect paths and group them:

type NumericPath = CollectPathsByType<number>;

type BooleanPath = CollectPathsByType<boolean>;

type StringPath = CollectPathsByType<string>;

type ConfigOptionPath = NumericPath | BooleanPath | StringPath;

3. Mapping paths to their return types

To match paths to their return types:

type ToReturnType<T extends ConfigOptionPath> = T extends NumericPath
  ? number
  : T extends BooleanPath
  ? boolean
  : T extends StringPath
  ? string
  : unknown;

4. Extending the interface with the map

Finally, we can use module augmentation to overload get on the Config interface:

declare module 'convict' {
  interface Config<T> {
    get<Path extends ConfigOptionPath>(path: Path): ToReturnType<Path>;
  }
}
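Steps 1 to 4 can be exercised end to end without installing convict. The sketch below reuses the types from the article on a trimmed copy of the example schema (doc fields omitted) and, instead of augmenting the 'convict' module, declares a hypothetical TypedConfig interface with the same get signature. TypedConfig, createConfig and the lookup that simply reads each option's default are assumptions made for this illustration.

```typescript
const schema = {
  env: { format: ['production', 'development', 'test'], default: 'development', env: 'NODE_ENV' },
  port: { format: Number, default: 5678, env: 'PORT', arg: 'port' },
  db: {
    host: { format: '*', default: 'server1.dev.test' },
    name: { format: String, default: 'users' },
  },
};

type ValidKeys<T> = keyof T extends string
  ? keyof T extends keyof NumberConstructor ? never : keyof T
  : never;

type GetPathSegments<Traversable, Filter> = Traversable extends Filter
  ? []
  : {
      [K in ValidKeys<Traversable>]: [K, ...GetPathSegments<Traversable[K], Filter>];
    }[ValidKeys<Traversable>];

type RemoveExcess<T> = T extends [...infer Path, 'format' | 'default']
  ? Path extends string[] ? Path : never
  : never;

type JoinByDotting<T extends string[]> = T extends [infer F]
  ? F
  : T extends [infer F, ...infer R]
  ? F extends string
    ? R extends string[] ? `${F}.${JoinByDotting<R>}` : never
    : never
  : string;

type ToDottedPath<T> = JoinByDotting<RemoveExcess<T>>;
type CollectPathsByType<T> = ToDottedPath<GetPathSegments<typeof schema, T>>;

type NumericPath = CollectPathsByType<number>; // "port"
type BooleanPath = CollectPathsByType<boolean>; // never for this trimmed schema
type StringPath = CollectPathsByType<string>; // "env" | "db.host" | "db.name"
type ConfigOptionPath = NumericPath | BooleanPath | StringPath;

type ToReturnType<T extends ConfigOptionPath> = T extends NumericPath
  ? number
  : T extends BooleanPath
  ? boolean
  : T extends StringPath
  ? string
  : unknown;

// Hypothetical stand-in for the augmented convict Config interface.
interface TypedConfig {
  get<Path extends ConfigOptionPath>(path: Path): ToReturnType<Path>;
}

// Hypothetical factory: walks the dotted path and returns the option's default.
function createConfig(): TypedConfig {
  return {
    get(path: string) {
      let node: any = schema;
      for (const key of path.split('.')) node = node[key];
      return node.default;
    },
  } as TypedConfig;
}

const typedConfig = createConfig();
const typedPort: number = typedConfig.get('port');
const typedHost: string = typedConfig.get('db.host');
// typedConfig.get('db.hots'); // rejected at compile time: not a ConfigOptionPath
```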

An option whose format is a string literal array must return a type that is a union of those string literals:

protocol: {
	format: ['http', 'https'], // return type should be 'http' | 'https'
	default: 'http',
	env: 'N8N_PROTOCOL',
	doc: 'HTTP Protocol via which n8n can be reached',
},

This poses a challenge. In ToReturnType above, T extends StringPath will resolve the return type of a string literal array to string, when it should resolve it to the union of its exact string literals. To resolve the return type, we need to create and index into a map like this:

type PathToStringLiteralUnionMap = {
  protocol: 'http' | 'https';
  'database.type': 'sqlite' | 'mariadb' | 'mysqldb' | 'postgresdb';
  'executions.process': 'main' | 'own';
  'executions.mode': 'regular' | 'queue';
  'executions.saveDataOnError': 'all' | 'none';
  'executions.saveDataOnSuccess': 'all' | 'none';
};

We begin by identifying all the string literal array paths. To do this, we can tag every string literal array in the schema with an as const assertion, which directs the compiler to infer the narrowest type possible for the value.

protocol: {
	format: ['http', 'https'] as const, // TS infers readonly ['http', 'https'] instead of string[]
	default: 'http',
	env: 'N8N_PROTOCOL',
	doc: 'HTTP Protocol via which n8n can be reached',
},
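The effect of the assertion is easiest to see side by side. In this small sketch (all names are illustrative), indexing the array type with number yields the element type, which is string without the assertion and the literal union with it.

```typescript
const looseFormat = ['http', 'https']; // string[]
const exactFormat = ['http', 'https'] as const; // readonly ["http", "https"]

type LooseElement = (typeof looseFormat)[number]; // string
type ExactElement = (typeof exactFormat)[number]; // "http" | "https"

const anyString: LooseElement = 'ftp'; // accepted: any string fits
const onlyListed: ExactElement = 'https'; // 'ftp' would be rejected here
```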

Next, we create a version of GetPathSegments that...

  • filters for the ReadonlyArray inferred by the as const assertion, and
  • collects the string literal union that the compiler infers for the elements of the string literal array.

type GetPathSegmentsWithUnions<T> = T extends ReadonlyArray<infer C>
  ? [C]
  : {
      [K in ValidKeys<T>]: [K, ...GetPathSegmentsWithUnions<T[K]>];
    }[ValidKeys<T>];

type Result = GetPathSegmentsWithUnions<typeof schema>;

Compare the new version above with the original version below. The only differences are the hardcoded filter type in place of the second generic, and the inference of C so that it can be added at the tail of the collecting array.

type GetPathSegments<Traversable, Filter> = Traversable extends Filter
  ? []
  : {
      [K in ValidKeys<Traversable>]: [
        K,
        ...GetPathSegments<Traversable[K], Filter>
      ];
    }[ValidKeys<Traversable>];

The Result of GetPathSegmentsWithUnions<typeof schema> is

["database", "type", "format", "sqlite" | "mariadb" | "mysqldb" | "postgresdb"] | ["executions", "process", "format", "main" | "own"] | ["executions", "mode", "format", "regular" | "queue"] | <etc>

This is a union of arrays, each item being the segment of the path to a config option that is a string literal array, except the last item — this last item is the string union for the string literal array.
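The same collector can be tried on a hypothetical two-option schema to see which paths survive. In the sketch below, miniSchema and the variable names are assumptions; only the options tagged with as const produce a tuple, while the port option falls away.

```typescript
type ValidKeys<T> = keyof T extends string
  ? keyof T extends keyof NumberConstructor ? never : keyof T
  : never;

type GetPathSegmentsWithUnions<T> = T extends ReadonlyArray<infer C>
  ? [C]
  : {
      [K in ValidKeys<T>]: [K, ...GetPathSegmentsWithUnions<T[K]>];
    }[ValidKeys<T>];

// Hypothetical mini schema; only the `as const` arrays are picked up.
const miniSchema = {
  protocol: { format: ['http', 'https'] as const, default: 'http', env: 'N8N_PROTOCOL' },
  executions: {
    mode: { format: ['regular', 'queue'] as const, default: 'regular' },
  },
  port: { format: Number, default: 5678 },
};

// ["protocol", "format", "http" | "https"] | ["executions", "mode", "format", "regular" | "queue"]
type MiniUnionSegments = GetPathSegmentsWithUnions<typeof miniSchema>;

// The last element of each tuple must be a member of that option's union.
const protocolSegments: MiniUnionSegments = ['protocol', 'format', 'https'];
const modeSegments: MiniUnionSegments = ['executions', 'mode', 'format', 'queue'];
```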

Next we reuse ToDottedPath to prepare the path and transform the union of arrays into a union of key-value pairs. This transformation is necessary so that we can consolidate all the pairs into a single map.

type ToPathUnionPair<T> = T extends [...infer Path, infer Union]
  ? { path: ToDottedPath<Path>; union: Union }
  : never;

type Result = ToPathUnionPair<GetPathSegmentsWithUnions<typeof schema>>;

Since we know that the path segments make up all the items in the array except the last one, we can grab the path segments using the ... rest operator, and then grab the union itself. Having both, we build the path with ToDottedPath<T> and we set the path and union in a mini-object. Feeding the union of arrays into ToPathUnionPair will output a union of path-union mini-objects, ready to be consolidated into a single map.

Result here is:

{ path: "database.type"; union: "sqlite" | "mariadb" | "mysqldb" | "postgresdb" } | { path: "executions.process"; union: "main" | "own" } | { path: "executions.mode"; union: "regular" | "queue"; } | <etc>

Finally, to consolidate them, we create a mapped type where...

  • the key of the mapped type is the path itself at the path key in the mini-object, e.g. for { path: "executions.process"; union: "main" | "own" }, the key will be "executions.process", and
  • the value of the mapped type is the result of applying TypeScript's built-in Extract utility to get the pair for that path, which we immediately index into with 'union', yielding the union itself.

To elaborate, Extract<T, U> extracts from T the members that are assignable to U, so we use Extract<T, { path: Path }> to find, in the path-union pairs, the pair whose path is the key being looped over, and we access the union for that path by indexing into its union key.

type ToStringLiteralMap<T extends { path: string; union: string }> = {
  [Path in T['path']]: Extract<T, { path: Path }>['union'];
};

type StringLiteralMap = ToStringLiteralMap<
  ToPathUnionPair<GetPathSegmentsWithUnions<typeof schema>>
>;

The resulting StringLiteralMap is the map we need:

{
  protocol: 'http' | 'https';
  'database.type': 'sqlite' | 'mariadb' | 'mysqldb' | 'postgresdb';
  'executions.process': 'main' | 'own';
  'executions.mode': 'regular' | 'queue';
  'executions.saveDataOnError': 'all' | 'none';
  'executions.saveDataOnSuccess': 'all' | 'none';
};
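The consolidation step can be checked in isolation by feeding ToStringLiteralMap a hand-written union of pairs rather than one derived from a schema. SamplePairs and the variable names below are assumptions made for this illustration.

```typescript
type ToStringLiteralMap<T extends { path: string; union: string }> = {
  [Path in T['path']]: Extract<T, { path: Path }>['union'];
};

// Hypothetical hand-written pairs, in the shape ToPathUnionPair produces.
type SamplePairs =
  | { path: 'database.type'; union: 'sqlite' | 'mariadb' | 'mysqldb' | 'postgresdb' }
  | { path: 'executions.process'; union: 'main' | 'own' };

type SampleMap = ToStringLiteralMap<SamplePairs>;

// Each key only accepts a member of the union extracted for that path.
const sampleChoices: SampleMap = {
  'database.type': 'postgresdb',
  'executions.process': 'own',
};

// Indexing the map by a path yields that path's union.
const processChoice: SampleMap['executions.process'] = sampleChoices['executions.process'];
```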

With this map, we can complete the correspondence between all path groups and return types:

type ToReturnType<T extends ConfigOptionPath> = T extends NumericPath
  ? number
  : T extends BooleanPath
  ? boolean
  : T extends StringLiteralArrayPath
  ? StringLiteralMap[T]
  : T extends StringPath
  ? string
  : unknown;
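A compact way to see the final mapping in action is to hand-write small stand-ins for the path groups and the map. Everything in the sketch below other than ToReturnType is an assumption: the groups would normally be derived from the schema, StringLiteralArrayPath is not defined in the text above and is taken here to be the keys of the map, and sampleValues and getOption are hypothetical.

```typescript
// Hand-written stand-ins for the derived path groups (assumptions).
type NumericPath = 'port';
type BooleanPath = 'executions.saveDataManualExecutions';
type StringPath = 'host' | 'protocol' | 'executions.process';

type StringLiteralMap = {
  protocol: 'http' | 'https';
  'executions.process': 'main' | 'own';
};

// Assumption: the string literal array paths are the keys of the map.
type StringLiteralArrayPath = keyof StringLiteralMap;

type ConfigOptionPath = NumericPath | BooleanPath | StringPath;

// The literal-array check must come before the plain string check,
// because those paths are also members of StringPath.
type ToReturnType<T extends ConfigOptionPath> = T extends NumericPath
  ? number
  : T extends BooleanPath
  ? boolean
  : T extends StringLiteralArrayPath
  ? StringLiteralMap[T]
  : T extends StringPath
  ? string
  : unknown;

// Hypothetical flat store standing in for a loaded config.
const sampleValues: Record<ConfigOptionPath, unknown> = {
  port: 5678,
  'executions.saveDataManualExecutions': false,
  host: 'localhost',
  protocol: 'http',
  'executions.process': 'main',
};

// Hypothetical getter with the same signature as the augmented get().
function getOption<Path extends ConfigOptionPath>(path: Path): ToReturnType<Path> {
  return sampleValues[path] as ToReturnType<Path>;
}

const protocolOption: 'http' | 'https' = getOption('protocol');
const hostOption: string = getOption('host');
const portOption: number = getOption('port');
```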