llmshield package
Subpackages
Submodules
llmshield.cloak_prompt module
Prompt cloaking module.
- Description:
This module handles the replacement of sensitive entities in prompts with secure placeholders before sending to LLMs. It maintains a mapping of placeholders to original values for later restoration.
- Functions:
cloak_prompt: Replace sensitive entities with placeholders
Note
This module is intended for internal use only. Users should interact with the LLMShield class rather than calling these functions directly.
- Author:
LLMShield by brainpolo, 2025
- llmshield.cloak_prompt.cloak_prompt(prompt: str, start_delimiter: str, end_delimiter: str, entity_map: dict[str, str] | None = None, entity_config: EntityConfig | None = None) tuple[str, dict[str, str]]
Cloak sensitive entities in prompt with selective configuration.
- Parameters:
prompt – Text to cloak entities in
start_delimiter – Opening delimiter for placeholders
end_delimiter – Closing delimiter for placeholders
entity_map – Existing placeholder mappings for consistency
entity_config – Configuration for selective entity detection
- Returns:
Tuple of (cloaked_prompt, entity_mapping)
Note
Collects all match positions from the original prompt
Sorts matches in descending order by start index
Replaces matches in one pass for optimal performance
Maintains placeholder consistency across calls
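A minimal sketch of this descending-order, single-pass replacement in plain Python (illustrative only, not the module's internal code; the offsets are hand-computed for the example string):
# Replacing matched spans from right to left keeps earlier offsets valid,
# so every span can be substituted in a single pass over the prompt.
text = "Hi, I'm John (john@example.com)"
matches = [(8, 12, "<PERSON_0>"), (14, 30, "<EMAIL_1>")]  # (start, end, placeholder)

for start, end, placeholder in sorted(matches, key=lambda m: m[0], reverse=True):
    text = text[:start] + placeholder + text[end:]

print(text)  # Hi, I'm <PERSON_0> (<EMAIL_1>)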
llmshield.core module
Core module for PII protection in LLM interactions.
- Description:
This module provides the main LLMShield class for protecting sensitive information in Large Language Model (LLM) interactions. It handles cloaking of sensitive entities in prompts before sending to LLMs, and uncloaking of responses to restore the original information.
- Classes:
- LLMShield: Main class orchestrating entity detection, cloaking, and
uncloaking
- Key Features:
Entity detection and protection (names, emails, numbers, etc.)
Configurable delimiters for entity placeholders
Direct LLM function integration
Zero dependencies
Example
>>> shield = LLMShield()
>>> safe_prompt, entities = shield.cloak("Hi, I'm John (john@example.com)")
>>> response = shield.uncloak(llm_response, entities)
- Author:
LLMShield by brainpolo, 2025
- class llmshield.core.LLMShield(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000, entity_config: EntityConfig | None = None)
Bases:
object
Main class for LLMShield protecting sensitive information in LLMs.
Example
>>> from llmshield import LLMShield
>>> shield = LLMShield()
>>> cloaked_prompt, entity_map = shield.cloak("Hi, I'm John Doe (john.doe@example.com)")
>>> print(cloaked_prompt)
"Hi, I'm <PERSON_0> (<EMAIL_1>)"
>>> llm_response = get_llm_response(cloaked_prompt)  # Your LLM call
>>> original = shield.uncloak(llm_response, entity_map)
- ask(stream: bool = False, messages: list[dict[str, str]] | None = None, **kwargs) str | Generator[str, None, None]
Complete end-to-end LLM interaction with automatic protection.
NOTE: If you are using structured output, ensure that your keys do not contain PII and that any values that may contain PII are strings, lists, or dicts. Other types, such as int and float, cannot be cloaked and will be returned as-is.
- Parameters:
prompt/message – Original prompt with sensitive information. This will be cloaked and passed to your LLM function. Pass only one of the two, and do not use any other parameter names, as they are unrecognised by the shield.
stream – Whether the LLM function streams its response. If True, returns a generator that yields incremental responses following the OpenAI Realtime Streaming API. If False, returns the complete response as a string. Defaults to False.
messages – List of message dictionaries for multi-turn conversations. They must come in the form of a list of dictionaries, where each dictionary has keys like “role” and “content”.
**kwargs – Additional arguments to pass to your LLM function, such as:
model – The model to use (e.g., “gpt-4”)
system_prompt – System instructions
temperature – Sampling temperature
max_tokens – Maximum tokens in the response
The arguments do not have to be passed in any specific order.
- Returns:
str: If stream is False, the complete uncloaked LLM response with original entities restored.
Generator[str, None, None]: If stream is True, a generator that yields incremental uncloaked responses, following the OpenAI Realtime Streaming API.
- Return type:
str | Generator[str, None, None]
Regardless of the specific implementation of the LLM function, whenever the stream parameter is true, the method returns a generator.
- Raises:
ValueError – If no LLM function was provided during initialization, if prompt is invalid, or if both prompt and message are provided
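A hedged usage sketch (my_llm is a hypothetical stand-in for your own LLM wrapper; any callable accepted by llm_func works):
def my_llm(prompt: str, **kwargs) -> str:
    # Call your LLM provider here; the prompt is echoed back for illustration.
    return f"Received: {prompt}"

shield = LLMShield(llm_func=my_llm)

# Entities are cloaked before my_llm sees the prompt and restored afterwards.
answer = shield.ask(prompt="Hi, I'm John (john@example.com)")

# With stream=True (and a generator-based llm_func) the same call would
# return a generator of uncloaked chunks instead of a complete string.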
- cloak(prompt: str | None, entity_map_param: dict[str, str] | None = None) tuple[str | None, dict[str, str]]
Cloak sensitive information in the prompt.
- Parameters:
prompt – The original prompt containing sensitive information.
entity_map_param – Optional existing entity map to maintain consistency.
- Returns:
Tuple of (cloaked_prompt, entity_mapping)
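A short sketch of reusing the entity map across turns so repeated entities keep the same placeholders (placeholder values shown are indicative):
shield = LLMShield()

first, entities = shield.cloak("Hi, I'm John Doe (john.doe@example.com)")
# e.g. "Hi, I'm <PERSON_0> (<EMAIL_1>)"

# Passing the previous map keeps "John Doe" bound to <PERSON_0>:
second, entities = shield.cloak(
    "Remind John Doe about the meeting.", entity_map_param=entities
)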
- classmethod disable_contacts(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000) LLMShield
Create LLMShield with contact information disabled.
Disables: EMAIL, PHONE detection.
- classmethod disable_locations(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000) LLMShield
Create LLMShield with location-based entities disabled.
Disables: PLACE, IP_ADDRESS, URL detection.
- classmethod disable_persons(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000) LLMShield
Create LLMShield with person entities disabled.
Disables: PERSON detection.
- classmethod only_financial(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000) LLMShield
Create LLMShield with only financial entities enabled.
Enables: CREDIT_CARD detection only.
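Example use of these factory constructors; the cloaked output shown is indicative, not guaranteed verbatim:
# Same parameters as __init__, minus entity_config, which each factory sets.
shield = LLMShield.disable_locations(start_delimiter="[", end_delimiter="]")

cloaked, entities = shield.cloak("See https://example.com or mail john@example.com")
# URL detection is disabled, so only the email is cloaked, e.g.:
# "See https://example.com or mail [EMAIL_0]"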
- stream_uncloak(response_stream: Generator[str, None, None], entity_map: dict[str, str] | None = None) Generator[str, None, None]
Restore original entities in streaming LLM responses.
The function processes the response stream chunk by chunk, yielding uncloaked chunks as soon as they can be resolved and buffering any remaining content that has not yet been uncloaked.
For non-stream responses, use the uncloak method instead.
- Limitations:
Only supports a response from a single LLM function call.
- Parameters:
response_stream – Iterator yielding cloaked LLM response chunks
entity_map – Mapping of placeholders to original values. By default, it is None, which means it will use the last cloak call’s entity map.
- Yields:
str – Uncloaked response chunks
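A usage sketch with a fake chunk generator standing in for a streaming LLM call; the placeholder is deliberately split across chunks, which the buffering described above should absorb:
shield = LLMShield()
cloaked, entities = shield.cloak("Hi, I'm John (john@example.com)")

def fake_llm_stream():
    # Stand-in for a streaming LLM call.
    yield "Hello <PER"
    yield "SON_0>, your email is "
    yield "<EMAIL_1>."

for chunk in shield.stream_uncloak(fake_llm_stream(), entity_map=entities):
    print(chunk, end="")
# Prints: Hello John, your email is john@example.com.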
- uncloak(response: str | list[Any] | dict[str, Any] | PydanticLike, entity_map: dict[str, str] | None = None) str | list[Any] | dict[str, Any] | PydanticLike
Restore original entities in the LLM response.
It supports strings and structured outputs consisting of any combination of strings, lists, and dictionaries.
For uncloaking stream responses, use the stream_uncloak method instead.
- Parameters:
response – The LLM response containing placeholders. Supports both strings and structured outputs (dicts).
entity_map – Mapping of placeholders to original values (if empty, uses mapping from last cloak call)
- Returns:
Response with original entities restored
- Raises:
TypeError – If the response parameter is of an invalid type.
ValueError – If no entity mapping is provided and no previous cloak call.
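A sketch of uncloaking a structured (dict) response; nested strings are restored in place:
shield = LLMShield()
cloaked, entities = shield.cloak("Hi, I'm John (john@example.com)")

structured = {
    "greeting": "Hello <PERSON_0>",
    "contacts": ["<EMAIL_1>"],
}
restored = shield.uncloak(structured, entity_map=entities)
# {"greeting": "Hello John", "contacts": ["john@example.com"]}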
llmshield.entity_detector module
Entity detection and classification module.
- Description:
This module implements comprehensive entity detection algorithms to identify personally identifiable information (PII) and sensitive data in text. It uses a multi-layered approach combining regex patterns, dictionary lookups, and contextual analysis to accurately detect various entity types.
- Classes:
EntityDetector: Main class for detecting entities in text
Entity: Data class representing a detected entity
EntityType: Enumeration of supported entity types
EntityGroup: Grouping of entity types into categories
EntityConfig: Configuration for selective entity detection
- Detection Methods:
Regex patterns for structured data (emails, URLs, phone numbers)
Dictionary lookups for known entities (cities, countries, organisations)
Contextual analysis for proper nouns and person names
Heuristic rules for complex entity patterns
- Author:
LLMShield by brainpolo, 2025
- class llmshield.entity_detector.Entity(type: EntityType, value: str)
Bases:
object
Represents a detected entity in text.
- property group: EntityGroup
Get the group this entity belongs to.
- type: EntityType
- value: str
- class llmshield.entity_detector.EntityConfig(enabled_types: frozenset[EntityType] | None = None)
Bases:
object
Configuration for selective entity detection and cloaking.
- classmethod disable_contacts() EntityConfig
Create config with contact information disabled.
- classmethod disable_locations() EntityConfig
Create config with location-based entities disabled.
- classmethod disable_persons() EntityConfig
Create config with person entities disabled.
- is_enabled(entity_type: EntityType) bool
Check if an entity type is enabled for detection.
- classmethod only_financial() EntityConfig
Create config with only financial entities enabled.
- with_disabled(*disabled_types: EntityType) EntityConfig
Create new config with specified types disabled.
- with_enabled(*enabled_types: EntityType) EntityConfig
Create new config with only specified types enabled.
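Example of composing a configuration from these methods and handing it to LLMShield:
from llmshield import EntityConfig, EntityType, LLMShield

# Start from a preset and narrow it further:
config = EntityConfig.disable_locations().with_disabled(EntityType.PHONE)

assert config.is_enabled(EntityType.EMAIL)
assert not config.is_enabled(EntityType.PHONE)

shield = LLMShield(entity_config=config)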
- class llmshield.entity_detector.EntityDetector(config: EntityConfig | None = None)
Bases:
object
Main entity detection system using rule-based and pattern approaches.
Identifies sensitive information in text using a waterfall approach where each detection method is tried in order, and the text is reduced as each entity is found. This eliminates potential overlapping entities and improves detection accuracy.
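The waterfall idea, sketched in standalone Python (illustrative logic only, not the library's internal code; the two regexes are simplified stand-ins):
import re

# Detectors run in priority order; matched spans are masked out of the
# working text so later, looser detectors cannot re-match them.
detectors = [
    ("EMAIL", re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")),
    ("PHONE", re.compile(r"\+?\d[\d ()-]{7,}\d")),
]

def detect(text: str) -> list[tuple[str, str]]:
    found = []
    working = text
    for entity_type, pattern in detectors:
        found.extend((entity_type, m.group()) for m in pattern.finditer(working))
        # Mask matches with spaces so offsets stay stable and the next
        # detector sees a reduced text.
        working = pattern.sub(lambda m: " " * len(m.group()), working)
    return found

print(detect("Call +1 415 555 0100 or mail a@b.com"))
# [('EMAIL', 'a@b.com'), ('PHONE', '+1 415 555 0100')]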
- class llmshield.entity_detector.EntityGroup(*values)
Bases:
str, Enum
Groups of related entity types.
- LOCATOR = 'LOCATOR'
- NUMBER = 'NUMBER'
- PNOUN = 'PNOUN'
- get_types() set[EntityType]
Get all entity types belonging to this group.
- class llmshield.entity_detector.EntityType(*values)
Bases:
str, Enum
Primary classification of entity types.
- CONCEPT = 'CONCEPT'
- CREDIT_CARD = 'CREDIT_CARD'
- EMAIL = 'EMAIL'
- IP_ADDRESS = 'IP_ADDRESS'
- ORGANISATION = 'ORGANISATION'
- PERSON = 'PERSON'
- PHONE = 'PHONE'
- PHONE_NUMBER = 'PHONE'
- PLACE = 'PLACE'
- URL = 'URL'
- classmethod all() frozenset[EntityType]
Return frozenset of all entity types.
- classmethod locators() frozenset[EntityType]
Return entity types that are location-based identifiers.
- classmethod numbers() frozenset[EntityType]
Return entity types that are numeric identifiers.
- classmethod proper_nouns() frozenset[EntityType]
Return entity types that are proper nouns.
llmshield.uncloak_response module
Response uncloaking module.
- Description:
This module handles the restoration of original sensitive data in LLM responses by replacing placeholders with their original values. It supports various response formats including strings, lists, dictionaries, and Pydantic models.
- Functions:
uncloak_response: Restore original entities in LLM response
Note
This module is intended for internal use only. Users should interact with the LLMShield class rather than calling these functions directly.
- Author:
LLMShield by brainpolo, 2025
llmshield.utils module
Utility functions and type definitions.
- Description:
This module provides common utility functions, type definitions, and helper protocols used throughout the LLMShield library. It includes validation functions, text processing utilities, and protocol definitions for type safety.
- Protocols:
PydanticLike: Protocol for Pydantic-compatible objects
- Functions:
split_fragments: Split text into processable fragments
is_valid_delimiter: Validate delimiter strings
wrap_entity: Create placeholder strings for entities
normalise_spaces: Normalise whitespace in text
is_valid_stream_response: Check if a response is streamable
conversation_hash: Generate hash for conversation caching
ask_helper: Internal helper for LLM ask operations
- Author:
LLMShield by brainpolo, 2025
- class llmshield.utils.PydanticLike(*args, **kwargs)
Bases:
Protocol
A protocol for types that behave like Pydantic models.
This is to provide type-safety for the uncloak function, which can accept either a string, list, dict, or a Pydantic model for LLM responses which return structured outputs.
NOTE: This protocol is not essential to the library; it exists to provide type safety for the uncloak function.
Pydantic models have the following methods:
model_dump() -> dict
model_validate(data: dict) -> Any
- model_dump() dict
Convert the model to a dictionary.
- classmethod model_validate(data: dict) Any
Create a model instance from a dictionary.
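A minimal class satisfying the protocol structurally, without depending on Pydantic:
from dataclasses import asdict, dataclass

@dataclass
class Reply:
    summary: str
    contacts: list[str]

    def model_dump(self) -> dict:
        return asdict(self)

    @classmethod
    def model_validate(cls, data: dict) -> "Reply":
        return cls(**data)

# Instances can be passed to LLMShield.uncloak like a Pydantic model:
reply = Reply(summary="Hello <PERSON_0>", contacts=["<EMAIL_1>"])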
- llmshield.utils.ask_helper(shield, stream: bool, **kwargs) str | Generator[str, None, None]
Handle the ask method of LLMShield.
This function checks if the input should be cloaked and handles both streaming and non-streaming cases using the provider system.
- Parameters:
shield – The LLMShield instance.
stream – Whether to stream the response.
**kwargs – Additional keyword arguments to pass to the LLM function.
- Returns:
The response from the LLM.
- Return type:
str | Generator[str, None, None]
- llmshield.utils.conversation_hash(obj: dict[str, str] | list[dict[str, str]]) int
Generate a stable, hashable key for a message or a list of messages.
If a single message is provided, hash its role and content. If a list of messages is provided, hash the set of (role, content) pairs.
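Usage sketch (the concrete hash values depend on the Python process; only their equality matters for caching):
from llmshield.utils import conversation_hash

message = {"role": "user", "content": "Hello"}
history = [
    {"role": "user", "content": "Hello"},
    {"role": "assistant", "content": "Hi there"},
]

key = conversation_hash(message)        # hashes (role, content)
cache_key = conversation_hash(history)  # hashes the set of (role, content) pairs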
- llmshield.utils.is_valid_delimiter(delimiter: str) bool
Validate a delimiter based on the following rules.
Must be a string.
Must be at least 1 character long.
- Parameters:
delimiter – The delimiter to validate.
- Returns:
True if the delimiter is valid, False otherwise.
- llmshield.utils.is_valid_stream_response(obj: object) bool
Check if obj is an iterable suitable for streaming.
- Parameters:
obj – The object to check.
- Returns:
True if obj is an iterable suitable for streaming, False otherwise.
- llmshield.utils.normalise_spaces(text: str) str
Normalise spaces by replacing multiple spaces with a single space.
- llmshield.utils.split_fragments(text: str) list[str]
Split the text into fragments based on the following rules.
Split on sentence boundaries (punctuation / new line)
Remove any empty fragments.
- Parameters:
text – The text to split.
- Returns:
A list of fragments.
- llmshield.utils.wrap_entity(entity_type: EntityType, suffix: int, start_delimiter: str, end_delimiter: str) str
Wrap an entity in a start and end delimiter.
The wrapper works as follows:
The value is wrapped with the start and end delimiters.
The suffix is appended to the entity.
- Parameters:
entity_type – The entity to wrap.
suffix – The suffix to append to the entity.
start_delimiter – The start delimiter.
end_delimiter – The end delimiter.
- Returns:
The wrapped entity.
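Usage sketch; the resulting placeholder format matches the cloaked prompts shown elsewhere in these docs:
from llmshield.entity_detector import EntityType
from llmshield.utils import wrap_entity

placeholder = wrap_entity(EntityType.PERSON, 0, "<", ">")
# Expected result: "<PERSON_0>"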
Module contents
Zero-dependency PII protection for LLM applications.
- Description:
llmshield is a lightweight Python library that automatically detects and protects personally identifiable information (PII) in prompts sent to language models. It replaces sensitive data with placeholders before processing and seamlessly restores the original information in responses.
- Classes:
LLMShield: Main interface for prompt cloaking and response uncloaking
EntityConfig: Configuration for selective entity detection
EntityType: Enumeration of supported entity types
- Functions:
create_shield: Factory function to create configured LLMShield instances
Examples
Basic usage:
>>> from llmshield import LLMShield
>>> shield = LLMShield()
>>> safe_prompt, entities = shield.cloak("Hi, I'm John (john@example.com)")
>>> response = shield.uncloak(llm_response, entities)
Direct usage with LLM:
>>> def my_llm(prompt: str) -> str:
...     # Your LLM API call here
...     return response
>>> shield = LLMShield(llm_func=my_llm)
>>> response = shield.ask(prompt="Hi, I'm John (john@example.com)")
- Author:
LLMShield by brainpolo, 2025
- class llmshield.EntityConfig(enabled_types: frozenset[EntityType] | None = None)
Bases:
object
Configuration for selective entity detection and cloaking.
- classmethod disable_contacts() EntityConfig
Create config with contact information disabled.
- classmethod disable_locations() EntityConfig
Create config with location-based entities disabled.
- classmethod disable_persons() EntityConfig
Create config with person entities disabled.
- is_enabled(entity_type: EntityType) bool
Check if an entity type is enabled for detection.
- classmethod only_financial() EntityConfig
Create config with only financial entities enabled.
- with_disabled(*disabled_types: EntityType) EntityConfig
Create new config with specified types disabled.
- with_enabled(*enabled_types: EntityType) EntityConfig
Create new config with only specified types enabled.
- class llmshield.EntityType(*values)
Bases:
str, Enum
Primary classification of entity types.
- CONCEPT = 'CONCEPT'
- CREDIT_CARD = 'CREDIT_CARD'
- EMAIL = 'EMAIL'
- IP_ADDRESS = 'IP_ADDRESS'
- ORGANISATION = 'ORGANISATION'
- PERSON = 'PERSON'
- PHONE = 'PHONE'
- PHONE_NUMBER = 'PHONE'
- PLACE = 'PLACE'
- URL = 'URL'
- classmethod all() frozenset[EntityType]
Return frozenset of all entity types.
- classmethod locators() frozenset[EntityType]
Return entity types that are location-based identifiers.
- classmethod numbers() frozenset[EntityType]
Return entity types that are numeric identifiers.
- classmethod proper_nouns() frozenset[EntityType]
Return entity types that are proper nouns.
- class llmshield.LLMShield(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000, entity_config: EntityConfig | None = None)
Bases:
object
Main class for LLMShield protecting sensitive information in LLMs.
Example
>>> from llmshield import LLMShield
>>> shield = LLMShield()
>>> cloaked_prompt, entity_map = shield.cloak("Hi, I'm John Doe (john.doe@example.com)")
>>> print(cloaked_prompt)
"Hi, I'm <PERSON_0> (<EMAIL_1>)"
>>> llm_response = get_llm_response(cloaked_prompt)  # Your LLM call
>>> original = shield.uncloak(llm_response, entity_map)
- ask(stream: bool = False, messages: list[dict[str, str]] | None = None, **kwargs) str | Generator[str, None, None]
Complete end-to-end LLM interaction with automatic protection.
NOTE: If you are using structured output, ensure that your keys do not contain PII and that any values that may contain PII are strings, lists, or dicts. Other types, such as int and float, cannot be cloaked and will be returned as-is.
- Parameters:
prompt/message – Original prompt with sensitive information. This will be cloaked and passed to your LLM function. Pass only one of the two, and do not use any other parameter names, as they are unrecognised by the shield.
stream – Whether the LLM function streams its response. If True, returns a generator that yields incremental responses following the OpenAI Realtime Streaming API. If False, returns the complete response as a string. Defaults to False.
messages – List of message dictionaries for multi-turn conversations. They must come in the form of a list of dictionaries, where each dictionary has keys like “role” and “content”.
**kwargs – Additional arguments to pass to your LLM function, such as:
model – The model to use (e.g., “gpt-4”)
system_prompt – System instructions
temperature – Sampling temperature
max_tokens – Maximum tokens in the response
The arguments do not have to be passed in any specific order.
- Returns:
str: If stream is False, the complete uncloaked LLM response with original entities restored.
Generator[str, None, None]: If stream is True, a generator that yields incremental uncloaked responses, following the OpenAI Realtime Streaming API.
- Return type:
str | Generator[str, None, None]
Regardless of the specific implementation of the LLM function, whenever the stream parameter is true, the method returns a generator.
- Raises:
ValueError – If no LLM function was provided during initialization, if prompt is invalid, or if both prompt and message are provided
- cloak(prompt: str | None, entity_map_param: dict[str, str] | None = None) tuple[str | None, dict[str, str]]
Cloak sensitive information in the prompt.
- Parameters:
prompt – The original prompt containing sensitive information.
entity_map_param – Optional existing entity map to maintain consistency.
- Returns:
Tuple of (cloaked_prompt, entity_mapping)
- classmethod disable_contacts(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000) LLMShield
Create LLMShield with contact information disabled.
Disables: EMAIL, PHONE detection.
- classmethod disable_locations(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000) LLMShield
Create LLMShield with location-based entities disabled.
Disables: PLACE, IP_ADDRESS, URL detection.
- classmethod disable_persons(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000) LLMShield
Create LLMShield with person entities disabled.
Disables: PERSON detection.
- classmethod only_financial(start_delimiter: str = '<', end_delimiter: str = '>', llm_func: Callable[[str], str] | Callable[[str], Generator[str, None, None]] | None = None, max_cache_size: int = 1000) LLMShield
Create LLMShield with only financial entities enabled.
Enables: CREDIT_CARD detection only.
- stream_uncloak(response_stream: Generator[str, None, None], entity_map: dict[str, str] | None = None) Generator[str, None, None]
Restore original entities in streaming LLM responses.
The function processes the response stream chunk by chunk, yielding uncloaked chunks as soon as they can be resolved and buffering any remaining content that has not yet been uncloaked.
For non-stream responses, use the uncloak method instead.
- Limitations:
Only supports a response from a single LLM function call.
- Parameters:
response_stream – Iterator yielding cloaked LLM response chunks
entity_map – Mapping of placeholders to original values. By default, it is None, which means it will use the last cloak call’s entity map.
- Yields:
str – Uncloaked response chunks
- uncloak(response: str | list[Any] | dict[str, Any] | PydanticLike, entity_map: dict[str, str] | None = None) str | list[Any] | dict[str, Any] | PydanticLike
Restore original entities in the LLM response.
It supports strings and structured outputs consisting of any combination of strings, lists, and dictionaries.
For uncloaking stream responses, use the stream_uncloak method instead.
- Parameters:
response – The LLM response containing placeholders. Supports both strings and structured outputs (dicts).
entity_map – Mapping of placeholders to original values (if empty, uses mapping from last cloak call)
- Returns:
Response with original entities restored
- Raises:
TypeError – If the response parameter is of an invalid type.
ValueError – If no entity mapping is provided and no previous cloak call.