Glob with named back-reference support.

Named glob (NGlob) patterns are an advanced form of pattern matching that supports back referencing of previously matched substrings.

It has the following use cases:

  • Single named wildcard: By default, the wildcard ${*name} is a placeholder for any string. One may also specify a pattern for ${*name} through optional arguments. For example:

    ngs = NGlobSingle("feedback_${*idx}.md", idx="[0-9][0-9][0-9]")

    Unlike ordinary wildcards, named wildcards never match an empty string.

  • Consistency within one pattern: If a pattern uses the same named wildcard multiple times, every occurrence must match the same substring. For example:

    ngs = NGlobSingle("archive_${*idx}/feedback_${*idx}.md", idx="[0-9][0-9][0-9]")

    These would match:

    • archive_042/
    • archive_777/

    This won’t match:

    • archive_042/

  • Consistency across multiple patterns: One can define multiple patterns and enforce consistency between their matches. For example:

    ngm = NGlobMulti("feedback_${*idx}.md", "report_${*idx}.pdf", idx="[0-9][0-9][0-9]")

    This will produce pairs of matches (provided the files are present). For example, the following would match:

    • with report_001.pdf
    • with report_123.pdf

    The following won’t be in the results, despite the fact that the files exist:

    • with report_123.pdf

  • Conventional (recursive) glob wildcards are also allowed and are called “anonymous wildcards” to clarify the distinction from named wildcards.
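
To make the back-reference idea concrete, here is a minimal sketch that translates a small subset of the named-glob syntax into a regular expression with named groups. The helper nglob_to_regex and the file names are hypothetical and not part of this module. The sketch also assumes that substitution patterns only use character classes such as [0-9], which mean the same thing in glob and regex syntax.

```python
import re

# Matches a named wildcard such as ${*idx} and captures its name.
WILDCARD = re.compile(r"\$\{\*([A-Za-z_][A-Za-z0-9_]*)\}")


def nglob_to_regex(pattern: str, subs: dict[str, str]) -> re.Pattern:
    """Hypothetical helper: convert a named glob into a regex with back-references."""
    seen = set()
    parts = []
    pos = 0
    for match in WILDCARD.finditer(pattern):
        # Literal text between wildcards must match verbatim.
        parts.append(re.escape(pattern[pos : match.start()]))
        name = match.group(1)
        if name in seen:
            # Later occurrences must repeat the first matched substring.
            parts.append(f"(?P={name})")
        else:
            seen.add(name)
            # Assumption: the default is a non-empty string without slashes.
            parts.append(f"(?P<{name}>{subs.get(name, '[^/]+')})")
        pos = match.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("".join(parts))


regex = nglob_to_regex("archive_${*idx}/feedback_${*idx}.md", {"idx": "[0-9][0-9][0-9]"})
print(bool(regex.fullmatch("archive_042/feedback_042.md")))  # True: both idx agree
print(bool(regex.fullmatch("archive_042/feedback_777.md")))  # False: idx differs
```

The back-reference (?P=idx) is what enforces consistency within one pattern; consistency across patterns needs a separate join step over the per-pattern results.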


NGlobMatch

A set of matches sharing consistent values for the named wildcards.

The matching files can be accessed by integer indexing or through the files attribute:

assert match[0] == match.files[0]

The substrings matching the named wildcards can be accessed as attributes. For example, the substring matching a named wildcard foo is available as match.foo.

When you expect only a single matching file, you can use the single attribute (match.single). It raises a ValueError when there are zero or multiple matches.

In the unfortunate case that your named wildcards are named single, files or mapping, you can access their values through the mapping attribute, e.g. match.mapping["single"].

@attrs.define
class NGlobMatch:
    """A set of matches sharing consistent values for the named wildcards.

    The matching files can be accessed by integer indexing or through the `files` attribute:

    assert match[0] == match.files[0]

    The substrings matching the named wildcards can be accessed as attributes.
    For example, the substring matching a named wildcard `foo` is available as `match.foo`.

    When you expect only a single matching file, you can use the `single` attribute.
    It raises a `ValueError` when there are zero or multiple matches.

    In the unfortunate case that your named wildcards are named `single`, `files` or `mapping`,
    you can access their values through the `mapping` attribute, e.g. `match.mapping["single"]`.
    """

    _mapping: dict[str, str]
    _files: list[Path | list[Path]]

    def __getitem__(self, idx) -> Path | list[Path]:
        return self._files[idx]

    def __getattr__(self, name) -> str:
        # Only called when normal attribute lookup fails: fall back to the named wildcards.
        try:
            return self._mapping[name]
        except KeyError as exc:
            raise AttributeError(f"'NGlobMatch' object has no attribute '{name}'") from exc

    @property
    def mapping(self) -> dict[str, str]:
        """Dictionary with `(wildcard_name, substring)` items."""
        return self._mapping

    @property
    def files(self) -> list[Path | list[Path]]:
        """Matching files, all having consistent substrings matching the named wildcards.

        Each item corresponds to a pattern in `NGlobMulti.patterns`.
        If a pattern has anonymous wildcards,
        the item itself is a list of all files matching the pattern.
        If the pattern contains no anonymous wildcards,
        the corresponding item in the returned list is a single path.
        """
        return self._files

    @property
    def single(self) -> Path:
        """A single path if there is exactly one match, raises an error otherwise."""
        if len(self._files) == 0:
            raise ValueError("No files matched.")
        if len(self._files) > 1:
            raise ValueError("Multiple files matched.")
        result = self._files[0]
        if isinstance(result, list):
            if len(result) == 0:
                raise ValueError("No files matched.")
            if len(result) > 1:
                raise ValueError("Multiple files matched.")
            result = result[0]
        return result

files property

Matching files, all having consistent substrings matching the named wildcards.

Each item corresponds to a pattern in NGlobMulti.patterns. If a pattern has anonymous wildcards, the item itself is a list of all files matching the pattern. If the pattern contains no anonymous wildcards, the corresponding item in the returned list is a single path.

mapping property

Dictionary with (wildcard_name, substring) items.

single property

A single path if there is exactly one match, raises an error otherwise.
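
The attribute lookup rules above can be illustrated with a small stand-in class. MatchSketch is a hypothetical simplification written for this example, not the real NGlobMatch. It shows that properties win over wildcard names because `__getattr__` is only consulted when normal lookup fails.

```python
from pathlib import Path


class MatchSketch:
    """Hypothetical stand-in mirroring the lookup behavior described for NGlobMatch."""

    def __init__(self, mapping: dict[str, str], files: list[Path]):
        self._mapping = mapping
        self._files = files

    def __getitem__(self, idx):
        return self._files[idx]

    def __getattr__(self, name):
        # Fallback for names that are not regular attributes or properties.
        try:
            return self._mapping[name]
        except KeyError as exc:
            raise AttributeError(name) from exc

    @property
    def mapping(self) -> dict[str, str]:
        return self._mapping

    @property
    def files(self) -> list[Path]:
        return self._files

    @property
    def single(self) -> Path:
        if len(self._files) != 1:
            raise ValueError(f"Expected one file, got {len(self._files)}.")
        return self._files[0]


# A wildcard that is unluckily named "single" is shadowed by the property.
match = MatchSketch({"idx": "001", "single": "shadowed"}, [Path("feedback_001.md")])
print(match.idx)                # wildcard value via __getattr__
print(match.single)             # the property, not the wildcard
print(match.mapping["single"])  # the wildcard value, via the mapping
```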


NGlobMulti

A sequence of Named Glob patterns for which consistent matches are collected.

@attrs.define
class NGlobMulti:
    """A sequence of Named Glob patterns for which consistent matches are collected."""

    _nglob_singles: tuple[NGlobSingle, ...] = attrs.field()
    _subs: dict[str, str] = attrs.field(init=False)
    _used_names: tuple[str, ...] = attrs.field(init=False)
    _has_wildcards: bool = attrs.field(init=False)
    _results: dict[tuple[str, ...], list[set[Path]]] = attrs.field(init=False, factory=dict)

    @_subs.default
    def _default_subs(self):
        if len(self._nglob_singles) == 0:
            return {}
        subs = self._nglob_singles[0].subs
        for other in self._nglob_singles[1:]:
            if other.subs != subs:
                raise ValueError("Searches in one NGlobMulti must use the same substitutions")
            other._subs = subs
        return subs

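    @_used_names.default
    def _default_used_names(self) -> tuple[str, ...]:
        result = set()
        for ngs in self._nglob_singles:
            # Collect the names of all patterns into one sorted tuple.
            result.update(ngs.used_names)
        return tuple(sorted(result))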

    @_has_wildcards.default
    def _default_has_wildcards(self) -> bool:
        for ngs in self._nglob_singles:
            if has_anonymous_wildcards(ngs.pattern):
                return True
        for name in self._used_names:
            pattern = self._subs.get(name)
            if pattern is None:
                return True
            if has_anonymous_wildcards(pattern):
                return True
        return False

    @classmethod
    def from_patterns(cls, patterns: Iterable[str], subs: dict[str, str] | None = None) -> Self:
        """Create a new instance for given patterns without any results.

        Parameters
        ----------
        patterns
            Named Glob patterns.
            Results will be constrained to have consistently matching substrings
            for the named wildcards appearing in all the patterns.
        subs
            Optional anonymous glob patterns for the named wildcards.
            When a name is not present, the wildcard `*` is used for this name.
        """
        if isinstance(patterns, str):
            raise TypeError("The patterns argument cannot be a string")
        if not all(isinstance(pattern, str) for pattern in patterns):
            raise TypeError(f"The patterns must be a list of strings, got {patterns}")
        if subs is None:
            subs = {}
        else:
            if not all(isinstance(name, str) for name in subs):
                raise TypeError(f"The subs keys must be strings, got {subs}")
            if not all(isinstance(value, str) for value in subs.values()):
                raise TypeError(f"The subs values must be strings, got {subs}")
        return cls(tuple(NGlobSingle(str(pattern), subs) for pattern in patterns))

    @property
    def nglob_singles(self) -> tuple[NGlobSingle, ...]:
        """The list of NGlobSingle instances, one for each pattern.

        These instances collect (partial) matches before any consistency is imposed between
        the substrings matching the same name in different patterns.
        """
        return self._nglob_singles

    @property
    def patterns(self):
        """The list of Named Glob patterns."""
        return [ngs.pattern for ngs in self._nglob_singles]

    @property
    def subs(self) -> dict[str, str]:
        """User-defined glob patterns for the named wildcards.

        When a name is not present, `*` is used.
        """
        return self._subs

    @property
    def used_names(self) -> tuple[str, ...]:
        """The names used across all the named glob patterns."""
        return self._used_names

    @property
    def has_wildcards(self) -> bool:
        """True if any named or anonymous wildcards are present in the patterns."""
        return self._has_wildcards

    @property
    def results(self) -> dict[tuple[str, ...], list[set[Path]]]:
        """A dictionary with all matches collected so far.

        A key in this dictionary is a tuple of substrings matching the named wildcards,
        using the same order as the `used_names` attribute.

        A value is a list of sets of paths.
        Each item in the list is a set of matching filenames for the corresponding
        pattern from the `patterns` attribute, whose named wildcards match the substrings
        of the key.

        The results can be extended with the `extend` and `glob` methods.
        Conversely, results can be removed with the `reduce` method.
        """
        return self._results

    def _iter_consistent(
        self, criteria: dict[str, str], full_paths: list | int
    ) -> Iterator[tuple[tuple[str, ...], list[list[Path]]]]:
        """Iterate over (partial) matching substrings and corresponding paths.

        Parameters
        ----------
        criteria
            A dictionary mapping named wildcards to matching substrings.
        full_paths
            If this is a list, it contains lists of paths matching the patterns
            in the `patterns` attribute with substrings consistent with those in
            the criteria argument.
            Note that this is a recursive iterator, so full_paths may contain fewer
            items than there are patterns when the recursion has not reached its full
            depth yet.
            If this is an integer, it is an index referring to the item in `patterns`
            that identifies the current pattern being processed.
        """
        start = full_paths if isinstance(full_paths, int) else len(full_paths)
        if start == len(self._nglob_singles):
            # We're in the deepest recursion: yield a result.
            yield tuple(criteria[name] for name in self._used_names), full_paths
        else:
            # Recursion in progress...
            ngs = self._nglob_singles[start]
            for new_values, paths in ngs.results.items():
                next_criteria = criteria.copy()
                # Check if named wildcards are consistent with the matching paths so far.
                for name, new_value in zip(ngs.used_names, new_values, strict=False):
                    value = next_criteria.get(name)
                    if value is None:
                        next_criteria[name] = new_value
                    elif value != new_value:
                        # Inconsistent matches for named wildcards in different patterns.
                        # This cannot produce a useful result.
                        next_criteria = None
                        break
                if next_criteria is not None:
                    # Consistency can still be imposed, so enter the next recursion...
                    next_full_paths = (
                        start + 1 if isinstance(full_paths, int) else [*full_paths, paths]
                    )
                    yield from self._iter_consistent(next_criteria, next_full_paths)

    def _extend_consistent(self, i: int, values: tuple[str, ...]):
        """Extend the results of this instance, given an added combination of matching substrings.

        Parameters
        ----------
        i
            The integer index of the pattern in the `patterns` attribute being processed.
        values
            A new set of substrings matching the named wildcards.
        """
        criteria = dict(zip(self._nglob_singles[i].used_names, values, strict=False))
        new_items = list(self._iter_consistent(criteria, []))
        for full_values, full_paths in new_items:
            self._results[full_values] = full_paths

    def _reduce_consistent(self, i: int, values: tuple[str, ...]):
        """Reduce the results of this instance, given a removed combination of matching substrings.

        Parameters
        ----------
        i
            The integer index of the pattern in the `patterns` attribute being processed.
        values
            The removed set of substrings matching the named wildcards.
        """
        criteria = dict(zip(self._nglob_singles[i].used_names, values, strict=False))
        old_items = list(self._iter_consistent(criteria, 0))
        for full_values, _ in old_items:
            del self._results[full_values]

    def extend(self, paths: Iterable[str]):
        """Try to extend the results by searching for matches in the given list of paths."""
        if isinstance(paths, str):
            raise TypeError("The paths argument cannot be a string.")
        for i, ngs in enumerate(self._nglob_singles):
            for values in ngs.extend(paths):
                self._extend_consistent(i, values)

    def reduce(self, paths: Iterable[str]):
        """Drop results by eliminating the provided paths."""
        if isinstance(paths, str):
            raise TypeError("The paths argument cannot be a string.")
        for i, ngs in enumerate(self._nglob_singles):
            for values in ngs.reduce(paths):
                self._reduce_consistent(i, values)

    def glob(self):
        """Extend the results with paths found by the built-in glob function."""
        for i, ngs in enumerate(self._nglob_singles):
            for values in ngs.glob():
                self._extend_consistent(i, values)

    def deepcopy(self):
        """Return an independent copy."""
        return copy.deepcopy(self)

    def equals(self, other: "NGlobMulti") -> bool:
        """Compare the results."""
        return self._results == other._results

    # Convenience methods

    def matches(self) -> Iterator[NGlobMatch]:
        """Iterate over combinations of files that consistently match all patterns.

        This offers a more convenient interface to the `results` attribute.

        Yields
        ------
        nglob_match
            An instance of NGlobMatch, which contains the substrings matching the named wildcards
            and the corresponding lists of paths.
        """
        for values, path_sets in sorted(self._results.items()):
            mapping = dict(zip(self._used_names, values, strict=False))
            files = [
                (sorted(paths) if has_anonymous_wildcards(ngs.pattern) else next(iter(paths)))
                for ngs, paths in zip(self._nglob_singles, path_sets, strict=False)
            ]
            yield NGlobMatch(mapping, files)

    def files(self) -> tuple[Path, ...]:
        """Return a tuple of sorted files that match the individual patterns.

        No constraints between multiple patterns are imposed and files may belong to partial
        and inconsistent full matches.
        """
        result = set()
        for ngs in self._nglob_singles:
            for path_set in ngs.results.values():
                # Merge the paths of every pattern, regardless of consistency.
                result.update(path_set)
        return tuple(sorted(result))

    def single(self) -> Path:
        """Return the single matching path.

        Raises
        ------
        ValueError
            If there is not exactly one match.
        """
        files = self.files()
        if len(files) != 1:
            raise ValueError(f"There are {len(files)} matches, not just one.")
        return files[0]

    def __bool__(self):
        """True when there are some items in the `results` attribute."""
        return len(self.results) > 0

    def __iter__(self) -> Iterator[str | NGlobMatch]:
        """Iterates over `self.matches` if there are named wildcards, else over `self.files`."""
        if len(self._used_names) > 0:
            return self.matches()
        return iter(self.files())

    def may_match(self, path):
        """Return True if the path matches one of the NGlobSingle instances.

        This means that it may be a path contributing to a consistent match of NGlobMulti.
        When added, it will show up in the result of the `files` method,
        and it may affect the outcome of the `matches` method.
        """
        return any(ngs.regex.fullmatch(path) for ngs in self._nglob_singles)

    def may_change(self, deleted: set[str], added: set[str]) -> bool:
        """Determine whether the results may change (later) after deleting or adding files.

        Parameters
        ----------
        deleted
            Set of files to be deleted.
        added
            Set of files to be added.

        Returns
        -------
        may_change
            True if the NGlobMulti results may change.
            (Further additions and deletions may be needed to get any effect,
            but it cannot be excluded that the provided deletions and additions play a role.)
        """
        added_new = added.copy()
        for ngs in self._nglob_singles:
            for paths in ngs.results.values():
                if not deleted.isdisjoint(paths):
                    return True
        for ngs in self._nglob_singles:
            for path in added_new:
                if ngs.regex.fullmatch(path):
                    return True
        return False

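    def will_change(self, deleted: Collection[str], added: Collection[str]) -> Self | None:
        """Determine whether the results will change after deleting or adding files.

        Parameters
        ----------
        deleted
            Set of files to be deleted.
        added
            Set of files to be added.

        Returns
        -------
        evolved
            A new modified copy with the changes, if any.
            None otherwise.
        """
        # Apply the changes to a copy, so this instance is left untouched.
        evolved = self.deepcopy()
        evolved.reduce(deleted)
        evolved.extend(added)
        return None if evolved.equals(self) else evolved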

has_wildcards property

True if any named or anonymous wildcards are present in the patterns.

nglob_singles property

The list of NGlobSingle instances, one for each pattern.

These instances collect (partial) matches before any consistency is imposed between the substrings matching the same name in different patterns.

patterns property

The list of Named Glob patterns.

results property

A dictionary with all matches collected so far.

A key in this dictionary is a tuple of substrings matching the named wildcards, using the same order as the used_names attribute.

A value is a list of sets of paths. Each item in the list is a set of matching filenames for the corresponding pattern from the patterns attribute, whose named wildcards match the substrings of the key.

The results can be extended with the extend and glob methods. Conversely, results can be removed with the reduce method.
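
As an illustration of this layout, the following sketch builds a results-like dictionary by hand and walks through it. The file names and the helper describe are hypothetical; only the shape of the dictionary (tuple keys in used_names order, one set of paths per pattern) follows the description above.

```python
from pathlib import Path

used_names = ("idx",)
patterns = ["feedback_${*idx}.md", "report_${*idx}.pdf"]

# Hand-written stand-in for the results attribute.
results = {
    ("001",): [{Path("feedback_001.md")}, {Path("report_001.pdf")}],
    ("123",): [{Path("feedback_123.md")}, {Path("report_123.pdf")}],
}


def describe(results, used_names, patterns):
    """Return (mapping, paths per pattern) pairs, sorted by key."""
    rows = []
    for values, path_sets in sorted(results.items()):
        # Pair each wildcard name with its substring.
        mapping = dict(zip(used_names, values))
        # Pair each pattern with its own set of matching paths.
        per_pattern = {p: sorted(s) for p, s in zip(patterns, path_sets)}
        rows.append((mapping, per_pattern))
    return rows


rows = describe(results, used_names, patterns)
for mapping, per_pattern in rows:
    print(mapping, per_pattern)
```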

subs property

User-defined glob patterns for the named wildcards.

When a name is not present, * is used.

used_names property

The names used across all the named glob patterns.


__bool__()

True when there are some items in the results attribute.

def __bool__(self):
    """True when there are some items in the `results` attribute."""
    return len(self.results) > 0


__iter__()

Iterates over self.matches if there are named wildcards, else over self.files.

def __iter__(self) -> Iterator[str | NGlobMatch]:
    """Iterates over `self.matches` if there are named wildcards, else over `self.files`."""
    if len(self._used_names) > 0:
        return self.matches()
    return iter(self.files())

_extend_consistent(i, values)

Extend the results of this instance, given an added combination of matching substrings.

Parameters:

  • i (int) –

    The integer index of the pattern in the patterns attribute being processed.

  • values (tuple[str, ...]) –

    A new set of substrings matching the named wildcards.

def _extend_consistent(self, i: int, values: tuple[str, ...]):
    """Extend the results of this instance, given an added combination of matching substrings.

    Parameters
    ----------
    i
        The integer index of the pattern in the `patterns` attribute being processed.
    values
        A new set of substrings matching the named wildcards.
    """
    criteria = dict(zip(self._nglob_singles[i].used_names, values, strict=False))
    new_items = list(self._iter_consistent(criteria, []))
    for full_values, full_paths in new_items:
        self._results[full_values] = full_paths

_iter_consistent(criteria, full_paths)

Iterate over (partial) matching substrings and corresponding paths.

Parameters:

  • criteria (dict[str, str]) –

    A dictionary mapping named wildcards to matching substrings.

  • full_paths (list | int) –

    If this is a list, it contains lists of paths matching the patterns in the patterns attribute with substrings consistent with those in the criteria argument. Note that this is a recursive iterator, so full_paths may contain fewer items than there are patterns when the recursion has not reached its full depth yet. If this is an integer, it is an index referring to the item in patterns that identifies the current pattern being processed.

def _iter_consistent(
    self, criteria: dict[str, str], full_paths: list | int
) -> Iterator[tuple[tuple[str, ...], list[list[Path]]]]:
    """Iterate over (partial) matching substrings and corresponding paths.

    Parameters
    ----------
    criteria
        A dictionary mapping named wildcards to matching substrings.
    full_paths
        If this is a list, it contains lists of paths matching the patterns
        in the `patterns` attribute with substrings consistent with those in
        the criteria argument.
        Note that this is a recursive iterator, so full_paths may contain fewer
        items than there are patterns when the recursion has not reached its full
        depth yet.
        If this is an integer, it is an index referring to the item in `patterns`
        that identifies the current pattern being processed.
    """
    start = full_paths if isinstance(full_paths, int) else len(full_paths)
    if start == len(self._nglob_singles):
        # We're in the deepest recursion: yield a result.
        yield tuple(criteria[name] for name in self._used_names), full_paths
    else:
        # Recursion in progress...
        ngs = self._nglob_singles[start]
        for new_values, paths in ngs.results.items():
            next_criteria = criteria.copy()
            # Check if named wildcards are consistent with the matching paths so far.
            for name, new_value in zip(ngs.used_names, new_values, strict=False):
                value = next_criteria.get(name)
                if value is None:
                    next_criteria[name] = new_value
                elif value != new_value:
                    # Inconsistent matches for named wildcards in different patterns.
                    # This cannot produce a useful result.
                    next_criteria = None
                    break
            if next_criteria is not None:
                # Consistency can still be imposed, so enter the next recursion...
                next_full_paths = (
                    start + 1 if isinstance(full_paths, int) else [*full_paths, paths]
                )
                yield from self._iter_consistent(next_criteria, next_full_paths)
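
The recursion is easier to follow in isolation. The sketch below is a simplified, self-contained variant that works on plain dictionaries instead of NGlobSingle instances. The function iter_consistent, the Single alias and the file names are assumptions made for this example. It only covers the case where full_paths is a list.

```python
from collections.abc import Iterator

# One entry per pattern: (used_names, results), mimicking an NGlobSingle.
Single = tuple[tuple[str, ...], dict[tuple[str, ...], set[str]]]


def iter_consistent(
    singles: list[Single], criteria: dict[str, str], full_paths: list[set[str]]
) -> Iterator[tuple[dict[str, str], list[set[str]]]]:
    """Yield combinations in which every name has one value across all patterns."""
    start = len(full_paths)
    if start == len(singles):
        # Every pattern contributed a set of paths: a complete match.
        yield criteria, full_paths
        return
    names, results = singles[start]
    for values, paths in results.items():
        next_criteria = dict(criteria)
        for name, value in zip(names, values):
            # setdefault returns the earlier value if the name was already fixed.
            if next_criteria.setdefault(name, value) != value:
                break  # inconsistent with an earlier pattern, skip this branch
        else:
            yield from iter_consistent(singles, next_criteria, [*full_paths, paths])


singles = [
    (("idx",), {("001",): {"feedback_001.md"}, ("123",): {"feedback_123.md"}}),
    (
        ("idx",),
        {
            ("001",): {"report_001.pdf"},
            ("123",): {"report_123.pdf"},
            ("999",): {"report_999.pdf"},
        },
    ),
]
found = list(iter_consistent(singles, {}, []))
for mapping, paths in found:
    print(mapping, paths)
```

In this sketch, report_999.pdf never shows up in a result because no feedback file shares its idx value, and no cross pairs such as 001 with 123 are produced.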

_reduce_consistent(i, values)

Reduce the results of this instance, given a removed combination of matching substrings.

Parameters:

  • i (int) –

    The integer index of the pattern in the patterns attribute being processed.

  • values (tuple[str, ...]) –

    The removed set of substrings matching the named wildcards.

def _reduce_consistent(self, i: int, values: tuple[str, ...]):
    """Reduce the results of this instance, given a removed combination of matching substrings.

    Parameters
    ----------
    i
        The integer index of the pattern in the `patterns` attribute being processed.
    values
        The removed set of substrings matching the named wildcards.
    """
    criteria = dict(zip(self._nglob_singles[i].used_names, values, strict=False))
    old_items = list(self._iter_consistent(criteria, 0))
    for full_values, _ in old_items:
        del self._results[full_values]


deepcopy()

Return an independent copy.

def deepcopy(self):
    """Return an independent copy."""
    return copy.deepcopy(self)


equals(other)

Compare the results.

def equals(self, other: "NGlobMulti") -> bool:
    """Compare the results."""
    return self._results == other._results


extend(paths)

Try to extend the results by searching for matches in the given list of paths.

def extend(self, paths: Iterable[str]):
    """Try to extend the results by searching for matches in the given list of paths."""
    if isinstance(paths, str):
        raise TypeError("The paths argument cannot be a string.")
    for i, ngs in enumerate(self._nglob_singles):
        for values in ngs.extend(paths):
            self._extend_consistent(i, values)


files()

Return a tuple of sorted files that match the individual patterns.

No constraints between multiple patterns are imposed and files may belong to partial and inconsistent full matches.

def files(self) -> tuple[Path, ...]:
    """Return a tuple of sorted files that match the individual patterns.

    No constraints between multiple patterns are imposed and files may belong to partial
    and inconsistent full matches.
    """
    result = set()
    for ngs in self._nglob_singles:
        for path_set in ngs.results.values():
            # Merge the paths of every pattern, regardless of consistency.
            result.update(path_set)
    return tuple(sorted(result))

from_patterns(patterns, subs=None) classmethod

Create a new instance for given patterns without any results.

Parameters:

  • patterns (Iterable[str]) –

    Named Glob patterns. Results will be constrained to have consistently matching substrings for the named wildcards appearing in all the patterns.

  • subs (dict[str, str] | None, default: None ) –

    Optional anonymous glob patterns for the named wildcards. When a name is not present, the wildcard * is used for this name.

@classmethod
def from_patterns(cls, patterns: Iterable[str], subs: dict[str, str] | None = None) -> Self:
    """Create a new instance for given patterns without any results.

    Parameters
    ----------
    patterns
        Named Glob patterns.
        Results will be constrained to have consistently matching substrings
        for the named wildcards appearing in all the patterns.
    subs
        Optional anonymous glob patterns for the named wildcards.
        When a name is not present, the wildcard `*` is used for this name.
    """
    if isinstance(patterns, str):
        raise TypeError("The patterns argument cannot be a string")
    if not all(isinstance(pattern, str) for pattern in patterns):
        raise TypeError(f"The patterns must be a list of strings, got {patterns}")
    if subs is None:
        subs = {}
    else:
        if not all(isinstance(name, str) for name in subs):
            raise TypeError(f"The subs keys must be strings, got {subs}")
        if not all(isinstance(value, str) for value in subs.values()):
            raise TypeError(f"The subs values must be strings, got {subs}")
    return cls(tuple(NGlobSingle(str(pattern), subs) for pattern in patterns))


glob()

Extend the results with paths found by the built-in glob function.

def glob(self):
    """Extend the results with paths found by the built-in glob function."""
    for i, ngs in enumerate(self._nglob_singles):
        for values in ngs.glob():
            self._extend_consistent(i, values)


matches()

Iterate over combinations of files that consistently match all patterns.

This offers a more convenient interface to the results attribute.

Yields:

  • nglob_match

    An instance of NGlobMatch, which contains the substrings matching the named wildcards and the corresponding lists of paths.

def matches(self) -> Iterator[NGlobMatch]:
    """Iterate over combinations of files that consistently match all patterns.

    This offers a more convenient interface to the `results` attribute.

    Yields
    ------
    nglob_match
        An instance of NGlobMatch, which contains the substrings matching the named wildcards
        and the corresponding lists of paths.
    """
    for values, path_sets in sorted(self._results.items()):
        mapping = dict(zip(self._used_names, values, strict=False))
        files = [
            (sorted(paths) if has_anonymous_wildcards(ngs.pattern) else next(iter(paths)))
            for ngs, paths in zip(self._nglob_singles, path_sets, strict=False)
        ]
        yield NGlobMatch(mapping, files)

may_change(deleted, added)

Determine whether the results may change (later) after deleting or adding files.

Parameters:

  • deleted (set[str]) –

    Set of files to be deleted.

  • added (set[str]) –

    Set of files to be added.


Returns:

  • may_change

    True if the NGlobMulti results may change. (Further additions and deletions may be needed to get any effect, but it cannot be excluded that the provided deletions and additions play a role.)

def may_change(self, deleted: set[str], added: set[str]) -> bool:
    """Determine whether the results may change (later) after deleting or adding files.

    Parameters
    ----------
    deleted
        Set of files to be deleted.
    added
        Set of files to be added.

    Returns
    -------
    may_change
        True if the NGlobMulti results may change.
        (Further additions and deletions may be needed to get any effect,
        but it cannot be excluded that the provided deletions and additions play a role.)
    """
    added_new = added.copy()
    for ngs in self._nglob_singles:
        for paths in ngs.results.values():
            if not deleted.isdisjoint(paths):
                return True
    for ngs in self._nglob_singles:
        for path in added_new:
            if ngs.regex.fullmatch(path):
                return True
    return False
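
The conservative nature of this check can be seen in a reduced sketch. The regexes, the known paths and the free function below are hypothetical stand-ins for the per-pattern state of the NGlobSingle instances; they are assumptions for illustration only.

```python
import re

# Hypothetical regexes, one per pattern.
regexes = [
    re.compile(r"feedback_(?P<idx>[0-9]{3})\.md"),
    re.compile(r"report_(?P<idx>[0-9]{3})\.pdf"),
]
# Hypothetical paths already collected for each pattern.
known = [{"feedback_001.md"}, {"report_001.pdf"}]


def may_change(deleted: set[str], added: set[str]) -> bool:
    """Return True if a deletion or addition touches any individual pattern."""
    # A deleted path that was collected earlier may remove a match.
    if any(not deleted.isdisjoint(paths) for paths in known):
        return True
    # An added path matching any single pattern may (later) complete a match.
    return any(regex.fullmatch(path) for regex in regexes for path in added)


print(may_change({"feedback_001.md"}, set()))  # True: a known file disappears
print(may_change(set(), {"report_002.pdf"}))   # True: may pair up later
print(may_change({"notes.txt"}, {"todo.txt"})) # False: nothing is affected
```

Note that adding report_002.pdf alone yields no new consistent match, since feedback_002.md is still missing. The check nevertheless returns True, because a later addition could complete the pair.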


may_match(path)

Return True if the path matches one of the NGlobSingle instances.

This means that it may be a path contributing to a consistent match of NGlobMulti. When added, it will show up in the result of the files method, and it may affect the outcome of the matches method.

def may_match(self, path):
    """Return True if the path matches one of the NGlobSingle instances.

    This means that it may be a path contributing to a consistent match of NGlobMulti.
    When added, it will show up in the result of the `files` method,
    and it may affect the outcome of the `matches` method.
    """
    return any(ngs.regex.fullmatch(path) for ngs in self._nglob_singles)


reduce(paths)

Drop results by eliminating the provided paths.

def reduce(self, paths: Iterable[str]):
    """Drop results by eliminating the provided paths."""
    if isinstance(paths, str):
        raise TypeError("The paths argument cannot be a string.")
    for i, ngs in enumerate(self._nglob_singles):
        for values in ngs.reduce(paths):
            self._reduce_consistent(i, values)


single()

Return the single matching path.

Raises:

  • ValueError

    If there is not exactly one match.

def single(self) -> Path:
    """Return the single matching path.

    Raises
    ------
    ValueError
        If there is not exactly one match.
    """
    files = self.files()
    if len(files) != 1:
        raise ValueError(f"There are {len(files)} matches, not just one.")
    return files[0]

will_change(deleted, added)

Determine whether the results will change after deleting or adding files.

Parameters:

  • deleted (Collection[str]) –

    Set of files to be deleted.

  • added (Collection[str]) –

    Set of files to be added.


Returns:

  • evolved

    A new modified copy with the changes, if any. None otherwise.

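def will_change(self, deleted: Collection[str], added: Collection[str]) -> Self | None:
    """Determine whether the results will change after deleting or adding files.

    Parameters
    ----------
    deleted
        Set of files to be deleted.
    added
        Set of files to be added.

    Returns
    -------
    evolved
        A new modified copy with the changes, if any.
        None otherwise.
    """
    # Apply the changes to a copy, so this instance is left untouched.
    evolved = self.deepcopy()
    evolved.reduce(deleted)
    evolved.extend(added)
    return None if evolved.equals(self) else evolved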
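
One way to read will_change is as a copy, modify and compare pattern. The sketch below demonstrates that pattern on a deliberately tiny stand-in. TinyResults, its regex and the file names are hypothetical, and it assumes that deletions are applied before additions.

```python
import copy
import re


class TinyResults:
    """Hypothetical stand-in that groups matching paths by a three-digit index."""

    regex = re.compile(r"report_([0-9]{3})\.pdf")

    def __init__(self):
        self.results: dict[str, set[str]] = {}

    def extend(self, paths):
        for path in paths:
            match = self.regex.fullmatch(path)
            if match:
                self.results.setdefault(match.group(1), set()).add(path)

    def reduce(self, paths):
        for path in paths:
            match = self.regex.fullmatch(path)
            if match and match.group(1) in self.results:
                self.results[match.group(1)].discard(path)
                if not self.results[match.group(1)]:
                    del self.results[match.group(1)]

    def will_change(self, deleted, added):
        # Work on a copy, so the original stays valid if nothing changes.
        evolved = copy.deepcopy(self)
        evolved.reduce(deleted)
        evolved.extend(added)
        return None if evolved.results == self.results else evolved


tiny = TinyResults()
tiny.extend(["report_001.pdf"])
unchanged = tiny.will_change([], ["notes.txt"])
evolved = tiny.will_change(["report_001.pdf"], ["report_002.pdf"])
print(unchanged, evolved.results)
```

Returning the evolved copy instead of a boolean lets the caller adopt the new state without repeating the work.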


NGlobSingle

Named glob with a single pattern.

@attrs.define
class NGlobSingle:
    """Named glob with a single pattern."""

    _pattern: str = attrs.field()
    _subs: dict[str, str] = attrs.field(factory=dict)
    _results: dict[tuple[str, ...], set[Path]] = attrs.field(factory=dict)
    _used_names: tuple[str, ...] = attrs.field(init=False)
    _glob_pattern: str = attrs.field(init=False)
    _regex: re.Pattern = attrs.field(init=False)

    @_used_names.default
    def _default_used_names(self) -> tuple[str, ...]:
        return tuple(sorted(set(iter_wildcard_names(self._pattern))))

    @_glob_pattern.default
    def _default_glob(self) -> str:
        return convert_nglob_to_glob(self._pattern, self._subs)

    @_regex.default
    def _default_regex(self) -> re.Pattern:
        return re.compile(convert_nglob_to_regex(self._pattern, self._subs))

    @property
    def pattern(self) -> str:
        """The Named Glob pattern used to match filenames."""
        return self._pattern

    @property
    def subs(self) -> dict[str, str]:
        """User-defined glob patterns for the named wildcards.

        When a name is not present, `*` is used.
        """
        return self._subs

    @property
    def results(self) -> dict[tuple[str, ...], set[Path]]:
        """All matching files, grouped by substrings matching the named wildcards.

        The keys of the `results` dictionary are tuples with the substrings,
        matching the respective named wildcards in the `used_names` tuple.
        The values are sets with matching paths.
        """
        return self._results

    @property
    def used_names(self) -> tuple[str, ...]:
        """A tuple of named wildcards present in the pattern."""
        return self._used_names

    @property
    def glob_pattern(self) -> str:
        """The conversion of the named glob to a (more general) conventional glob pattern."""
        return self._glob_pattern

    @property
    def regex(self) -> re.Pattern:
        """The conversion of the named glob to a regular expression."""
        return self._regex

    def _loop_matches(
        self, paths: Iterable[str]
    ) -> Iterator[tuple[tuple[str, ...], set[Path], Path]]:
        """Low-level iterator used by the `extend` and `reduce` methods.

        The paths are tested one by one against the regular expression.
        In case of a hit, it yields a tuple with the following three items:

        - `values`: the substrings matching the named wildcards.
        - `path_set`: the current set of paths associated with the combination of substrings.
        - `path`: a `Path` instance of the matching path.
        """
        for path in paths:
            match_ = self._regex.fullmatch(path)
            if match_ is not None:
                mapping = match_.groupdict()
                values = tuple(mapping[name] for name in self._used_names)
                # Use a separate name to avoid shadowing the `paths` argument.
                path_set = self._results.get(values)
                if path_set is None:
                    path_set = set()
                    self._results[values] = path_set
                yield values, path_set, Path(path)
                # Drop combinations that are left without any paths.
                if len(path_set) == 0:
                    del self._results[values]

    def extend(self, paths: Iterable[str]) -> Iterator[tuple[str, ...]]:
        """Add matching paths from the given list of paths.

        Yields
        ------
        values
            A tuple with the substrings matching the named wildcards,
            only if this combination of values was not present yet.
        """
        for values, path_set, path in self._loop_matches(paths):
            if len(path_set) == 0:
                yield values
            path_set.add(path)

    def reduce(self, paths: Iterable[str]) -> Iterator[tuple[str, ...]]:
        """Remove matching paths from the given list of paths.

        Yields
        ------
        values
            A tuple with the substrings matching the named wildcards,
            only if the last matching path for this combination was removed.
        """
        for values, path_set, path in self._loop_matches(paths):
            if len(path_set) > 0:
                path_set.discard(path)
                if len(path_set) == 0:
                    yield values

    def glob(self) -> Iterator[tuple[str, ...]]:
        """Extend the results with paths obtained through Python's built-in glob module.

        Yields
        ------
        values
            A tuple with the substrings matching the named wildcards,
            only if this combination of values was not present yet.
        """
        paths = []
        for path in glob.iglob(self._glob_pattern, recursive=True, include_hidden=True):
            path = Path(path)
            if path.is_dir():
                path = path / ""
            paths.append(path)
        yield from self.extend(paths)

glob_pattern property

The conversion of the named glob to a (more general) conventional glob pattern.

pattern property

The Named Glob pattern used to match filenames.

regex property

The conversion of the named glob to a regular expression.

results property

All matching files, grouped by substrings matching the named wildcards.

The keys of the results dictionary are tuples with the substrings, matching the respective named wildcards in the used_names tuple. The values are sets with matching paths.
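As a rough illustration of this layout, the following standalone sketch groups a few paths by the value of one named wildcard. The hand-written regular expression and the helper `group_by_names` are assumptions made for the example; they are not part of the module.

```python
import re
from pathlib import Path

# Hand-written regex, roughly what "archive_${*idx}/*.md" with
# idx="[0-9][0-9][0-9]" could translate to (an assumption for this sketch).
REGEX = re.compile(r"archive_(?P<idx>[0-9][0-9][0-9])/[^/]+\.md")


def group_by_names(regex, used_names, paths):
    """Group matching paths by the substrings of the named wildcards."""
    results = {}
    for path in paths:
        match_ = regex.fullmatch(path)
        if match_ is not None:
            # The key is a tuple of values, in the order of used_names.
            values = tuple(match_.group(name) for name in used_names)
            results.setdefault(values, set()).add(Path(path))
    return results


results = group_by_names(
    REGEX,
    ("idx",),
    ["archive_042/a.md", "archive_042/b.md", "archive_777/a.md", "notes.txt"],
)
for values, path_set in sorted(results.items()):
    print(values, sorted(str(path) for path in path_set))
```

Each key has one entry per name in `used_names`, so a pattern with two named wildcards would produce keys that are 2-tuples.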

subs property

User-defined glob patterns for the named wildcards.

When a name is not present, * is used.

used_names property

A tuple of named wildcards present in the pattern.


Low-level iterator used by the extend and reduce methods.

The paths are tested one by one against the regular expression. In case of a hit, it yields a tuple with the following three items:

  • values: the substrings matching the named wildcards.
  • path_set: the current set of paths associated with the combination of substrings.
  • path: a Path instance of the matching path.
def _loop_matches(
    self, paths: Iterable[str]
) -> Iterator[tuple[tuple[str, ...], set[Path], Path]]:
    """Low-level iterator used by the `extend` and `reduce` methods.

    The paths are tested one by one against the regular expression.
    In case of a hit, it yields a tuple with the following three items:

    - `values`: the substrings matching the named wildcards.
    - `path_set`: the current set of paths associated with the combination of substrings.
    - `path`: a `Path` instance of the matching path.
    """
    for path in paths:
        match_ = self._regex.fullmatch(path)
        if match_ is not None:
            mapping = match_.groupdict()
            values = tuple(mapping[name] for name in self._used_names)
            # Use a separate name to avoid shadowing the `paths` argument.
            path_set = self._results.get(values)
            if path_set is None:
                path_set = set()
                self._results[values] = path_set
            yield values, path_set, Path(path)
            # Drop combinations that are left without any paths.
            if len(path_set) == 0:
                del self._results[values]


Add matching paths from the given list of paths.

Yields:

  • values

    A tuple with the substrings matching the named wildcards, only if this combination of values was not present yet.

def extend(self, paths: Iterable[str]) -> Iterator[tuple[str, ...]]:
    """Add matching paths from the given list of paths.

    Yields
    ------
    values
        A tuple with the substrings matching the named wildcards,
        only if this combination of values was not present yet.
    """
    for values, path_set, path in self._loop_matches(paths):
        if len(path_set) == 0:
            yield values
        path_set.add(path)


Extend the results with paths obtained through Python’s built-in glob module.


Yields:

  • values

    A tuple with the substrings matching the named wildcards, only if this combination of values was not present yet.

def glob(self) -> Iterator[tuple[str, ...]]:
    """Extend the results with paths obtained through Python's built-in glob module.

    Yields
    ------
    values
        A tuple with the substrings matching the named wildcards,
        only if this combination of values was not present yet.
    """
    paths = []
    for path in glob.iglob(self._glob_pattern, recursive=True, include_hidden=True):
        path = Path(path)
        if path.is_dir():
            path = path / ""
        paths.append(path)
    yield from self.extend(paths)


Remove matching paths from the given list of paths.

Yields:

  • values

    A tuple with the substrings matching the named wildcards, only if the last matching path for this combination was removed.

def reduce(self, paths: Iterable[str]) -> Iterator[tuple[str, ...]]:
    """Remove matching paths from the given list of paths.

    Yields
    ------
    values
        A tuple with the substrings matching the named wildcards,
        only if the last matching path for this combination was removed.
    """
    for values, path_set, path in self._loop_matches(paths):
        if len(path_set) > 0:
            path_set.discard(path)
            if len(path_set) == 0:
                yield values
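
The bookkeeping described for extend and reduce can be sketched without the class. The helpers `extend_sketch` and `reduce_sketch` below are hypothetical and operate on a plain dictionary; they only illustrate when a combination of values would be reported as new or as removed.

```python
def extend_sketch(results, values, path):
    """Add a path; return True if this combination of values is new."""
    path_set = results.setdefault(values, set())
    is_new = len(path_set) == 0
    path_set.add(path)
    return is_new


def reduce_sketch(results, values, path):
    """Remove a path; return True if the combination has no paths left."""
    path_set = results.get(values)
    if not path_set:
        return False
    path_set.discard(path)
    if len(path_set) == 0:
        # An empty combination is dropped from the results entirely.
        del results[values]
        return True
    return False


results = {}
events = [
    extend_sketch(results, ("042",), "archive_042/a.md"),  # new combination
    extend_sketch(results, ("042",), "archive_042/b.md"),  # already known
    reduce_sketch(results, ("042",), "archive_042/a.md"),  # one path remains
    reduce_sketch(results, ("042",), "archive_042/b.md"),  # last path removed
]
print(events, results)
```

Only the first addition and the last removal are reported, which mirrors the "not present yet" and "last matching path" conditions in the descriptions above.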

convert_nglob_to_glob(pattern, subs=None)

Convert nglob wildcards to ordinary ones, compatible with builtin glob and fnmatch modules.


Parameters:

  • pattern (str) –

    A string with named wildcards.

  • subs (dict[str, str] | None, default: None ) –

    A dictionary mapping names to glob patterns. If a name is not present, * is used as default.


Returns:

  • pattern

    A conventional wildcard string, without the constraint that named wildcards must correspond. Where possible, neighboring wildcards are merged into one.
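
To make "without the constraint that named wildcards must correspond" concrete, here is a simplified sketch. The regex `NAMED` and the helper `to_glob_sketch` are assumptions for illustration: the helper only substitutes named wildcards and does not merge neighboring wildcards as described above.

```python
import fnmatch
import re

# Hypothetical, simplified matcher for named wildcards such as ${*idx}.
NAMED = re.compile(r"\$\{\*([a-zA-Z0-9_]+)\}")


def to_glob_sketch(pattern, subs=None):
    """Replace each named wildcard by its glob pattern, or by '*' if absent."""
    subs = subs or {}
    return NAMED.sub(lambda m: subs.get(m.group(1), "*"), pattern)


glob_pattern = to_glob_sketch(
    "archive_${*idx}/feedback_${*idx}.md", {"idx": "[0-9][0-9][0-9]"}
)
default_pattern = to_glob_sketch("report_${*name}.pdf")

# The conventional glob is more general: it accepts inconsistent values.
accepts_inconsistent = fnmatch.fnmatchcase("archive_042/feedback_777.md", glob_pattern)
accepts_short = fnmatch.fnmatchcase("archive_04/feedback_777.md", glob_pattern)
print(glob_pattern, default_pattern, accepts_inconsistent, accepts_short)
```

Because the glob alone cannot enforce that both occurrences of `idx` agree, a stricter check (such as the regular expression conversion) is still needed afterwards.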

def convert_nglob_to_glob(pattern: str, subs: dict[str, str] | None = None) -> str:
    """Convert nglob wildcards to ordinary ones, compatible with builtin glob and fnmatch modules.

    Parameters
    ----------
    pattern
        A string with named wildcards.
    subs
        A dictionary mapping names to glob patterns.
        If a name is not present, `*` is used as default.

    Returns
    -------
    pattern
        A conventional wildcard string, without the constraint that named wildcards must correspond.
        Where possible, neighboring wildcards are merged into one.
    """
    if subs is None:
        subs = {}
    # Split in text, wildcard and named wildcard fragments.
    parts = []
    # The odd-numbered indices match a (named) wildcard.
    for i, part in enumerate(RE_NAMED_WILD.split(pattern)):
        if i % 2 == 1 and part.startswith("${*"):
            # Split the substituted named wildcards once more.
            parts.extend(RE_NAMED_WILD.split(subs.get(part[3:-1], "*")))
        else:
            # No substitution, so no additional splitting required.
            parts.append(part)
    # Remove empty strings due to neighboring wildcards with no normal text in between.
    parts = [part for part in parts if part != ""]
    # Make sure no asterisks are glued together and a few other simplifications.
    texts = []
    for part in parts:
        if len(texts) == 0 or part == "?":
            texts.append(part)
        elif part == "*":
            if texts[-1] not in ["*", "**"]:
                texts.append(part)
        elif part == "**":
            if texts[-1] in ["*"]:
                texts[-1] = "**"
            elif texts[-1] != "**":
                texts.append(part)
        elif part == "**/":
            if texts[-1] in ["*"] or texts[-1] == "**":
                texts[-1] = "**/"
            elif texts[-1] != "**/":
                texts.append(part)
        else:
            texts.append(part)
    return "".join(texts)

convert_nglob_to_regex(pattern, subs=None, allow_names=True)

Convert a named glob pattern to a regular expression.


Parameters:

  • pattern (str) –

    A string with named wildcards.

  • subs (dict[str, str] | None, default: None ) –

    A dictionary mapping names to glob patterns. If a name is not present, * is used as default.

  • allow_names (bool, default: True ) –

    When set to False, named wildcards are not allowed.


Returns:

  • regex

    A regular expression string to test if a string matches the pattern. It also contains symbolic groups to extract values corresponding to named wildcards and to impose consistency when the same name appears multiple times.
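
The role of the symbolic groups can be shown with Python's `re` module alone. The regular expression below is hand-written for the pattern `archive_${*idx}/feedback_${*idx}.md` with `idx="[0-9][0-9][0-9]"`; it is an assumption about the general shape of the output, and the string actually produced by this function may differ in detail.

```python
import re

# First occurrence of idx: a named group. Second occurrence: a back-reference.
REGEX = re.compile(r"archive_(?P<idx>[0-9][0-9][0-9])/feedback_(?P=idx)\.md")

consistent = REGEX.fullmatch("archive_042/feedback_042.md")
inconsistent = REGEX.fullmatch("archive_042/feedback_777.md")

# The named group gives access to the substring matched by the wildcard.
idx_value = consistent.group("idx")
print(idx_value, inconsistent)
```

The back-reference `(?P=idx)` only matches the exact text captured by `(?P<idx>...)`, which is what enforces consistency when a name appears more than once.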

def convert_nglob_to_regex(
    pattern: str, subs: dict[str, str] | None = None, allow_names: bool = True
) -> str:
    """Convert a named glob pattern to a regular expression.

    Parameters
    ----------
    pattern
        A string with named wildcards.
    subs
        A dictionary mapping names to glob patterns.
        If a name is not present, `*` is used as default.
    allow_names
        When set to `False`, named wildcards are not allowed.

    Returns
    -------
    regex
        A regular expression string to test if a string matches the pattern.
        It also contains symbolic groups to extract values
        corresponding to named wildcards
        and to impose consistency when the same name appears multiple times.
    """
    if subs is None:
        subs = {}
    parts = []
    # Last non-empty part matched by re.split
    last = None
    # Names encountered so far
    encountered = set()
    for i, part in enumerate(RE_NAMED_WILD.split(pattern)):
        if i % 2 == 0:
            if len(part) > 0:
                # Not a wildcard: escape regex characters.
                parts.append(re.escape(part))
        else:
            # A (named) wildcard: replace with corresponding regex.
            replace = False
            regex = None
            if part == "?":
                regex = r"[^/]"
            elif part == "*":
                if last not in ["*", "**"]:
                    regex = r"[^/]*"
            elif part == "**":
                if last != "**":
                    regex = r".*"
                    if last in ["*"]:
                        replace = True
            elif part == "**/":
                if last != "**/":
                    regex = r"(?:.*/|)"
                    if last in ["*", "**"]:
                        replace = True
            elif part.startswith("[") and part.endswith("]"):
                regex = rf"[^{part[2:-1]}]" if part[1] == "!" else rf"[{part[1:-1]}]"
            elif part.startswith("${*") and part.endswith("}"):
                if not allow_names:
                    raise ValueError(f"Named wildcards not allowed in {pattern}")
                name = part[3:-1]
                if name in encountered:
                    # Repeated name: back-reference to the first occurrence.
                    regex = rf"(?P={name})"
                else:
                    # First occurrence: define the symbolic group.
                    encountered.add(name)
                    part_regex = convert_nglob_to_regex(subs.get(name, "*"), {}, False)
                    regex = rf"(?P<{name}>{part_regex})"
            else:
                raise ValueError(f"Cannot convert wildcard to regex: {part}")
            if regex is not None and len(regex) > 0:
                if replace:
                    parts[-1] = regex
                else:
                    parts.append(regex)
        if len(part) > 0:
            last = part

    if allow_names:
        # Post-process anonymous wildcards:
        # - when enclosed by separators, '*' and '**' do not match empty strings.
        for ipart, part in enumerate(parts):
            if (
                ipart > 0
                and ipart < len(parts) - 1
                and part.endswith("*")
                and parts[ipart - 1].endswith("/")
                and parts[ipart + 1].startswith("/")
            ):
                parts[ipart] = f"{part[:-1]}+"
        # - when the pattern ends with '*', it must also match paths with a trailing separator.
        if parts[-1] == r"[^/]*":
            if len(parts) >= 2 and parts[-2].endswith("/"):
                parts[-1] = r"[^/]+"

    return "".join(parts)


Test if a glob pattern has anonymous wildcards.

def has_anonymous_wildcards(pattern: str) -> bool:
    """Test if a glob pattern has anonymous wildcards."""
    for ipart, part in enumerate(RE_NAMED_WILD.split(pattern)):
        if ipart % 2 == 1 and not part.startswith("${*"):
            return True
    return False


Test if a glob pattern has anonymous or named wildcards.

def has_wildcards(pattern: str) -> bool:
    """Test if a glob pattern has anonymous or named wildcards."""
    return RE_NAMED_WILD.search(pattern) is not None


Iterate over the names of the named wildcards in a Named Glob pattern.

def iter_wildcard_names(pattern: str) -> Iterator[str]:
    """Iterate over the names of the named wildcards in a Named Glob pattern."""
    for ipart, part in enumerate(RE_NAMED_WILD.split(pattern)):
        if ipart % 2 == 1 and part.startswith("${*"):
            yield part[3:-1]
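

The three helpers above rely on the same splitting trick: a regular expression with a single capturing group, passed to `split`, returns plain text at even indices and wildcards at odd indices. The definition of `RE_NAMED_WILD` is not shown in this section, so the sketch below uses a hypothetical, simplified stand-in named `WILD_SKETCH`.

```python
import re

# Hypothetical stand-in for RE_NAMED_WILD (the real definition is not shown here).
# One capturing group, so re.split keeps the wildcards at odd indices.
WILD_SKETCH = re.compile(r"(\$\{\*[a-zA-Z0-9_]+\}|\*\*/|\*\*|\*|\?|\[[^\]]+\])")

fragments = WILD_SKETCH.split("archive_${*idx}/feedback_${*idx}_*.md")

# Named wildcards start with "${*"; the name sits between "${*" and "}".
names = [
    part[3:-1]
    for i, part in enumerate(fragments)
    if i % 2 == 1 and part.startswith("${*")
]
# All other odd-indexed fragments are anonymous wildcards.
anonymous = [
    part
    for i, part in enumerate(fragments)
    if i % 2 == 1 and not part.startswith("${*")
]
print(fragments, names, anonymous)
```

Names are reported once per occurrence, which is why the class listing above deduplicates and sorts them when building `used_names`.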