Description
The type_of_generic() method in the InQuest analyzer has a TODO comment at line 49 indicating that the validation should be more thorough:
https://github.com/intelowlproject/IntelOwl/blob/master/api_app/analyzers_manager/observable_analyzers/inquest.py#L45-L51
def type_of_generic(self):
    if re.match(r"^[\w\.\+\-]+\@[\w]+\.[a-z]{2,3}$", self.observable_name):
        type_ = "email"
    else:
        # TODO: This should be validated more thoroughly
        type_ = "filename"
    return type_
Current Issues
- Weak email regex: The pattern doesn't handle:
  - TLDs longer than 3 characters (.info, .museum, .technology)
  - Subdomains (user@sub.domain.com)
- No detection for other supported types: The InQuest API supports registry and xmpid types (as seen in line 83), but these are never detected automatically
- Everything defaults to filename: Any non-email observable is assumed to be a filename without validation
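To make the issues above concrete, here is a minimal standalone sketch that copies the current regex and branch logic into a free function. The name `current_type_of_generic` and the sample inputs are hypothetical and exist only for this illustration; they are not part of IntelOwl.

```python
import re

# Pattern copied verbatim from the current type_of_generic() implementation.
CURRENT_EMAIL_PATTERN = r"^[\w\.\+\-]+\@[\w]+\.[a-z]{2,3}$"


def current_type_of_generic(observable_name: str) -> str:
    """Standalone copy of the current logic (hypothetical helper, illustration only)."""
    if re.match(CURRENT_EMAIL_PATTERN, observable_name):
        return "email"
    # Everything that is not matched as an email falls through to "filename".
    return "filename"


# Hypothetical sample inputs covering the cases listed above.
samples = [
    "user@example.com",                       # short TLD, no subdomain
    "user@example.info",                      # TLD longer than 3 characters
    "user@sub.domain.com",                    # subdomain
    "HKEY_LOCAL_MACHINE\\SOFTWARE\\Example",  # registry key
]

for sample in samples:
    print(f"{sample!r} -> {current_type_of_generic(sample)}")
```

Only the first sample is classified as "email"; the other three all come back as "filename".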
Proposed Solution
Improve the type_of_generic() method to:
- Use a more comprehensive email regex pattern
- Add detection for Windows registry keys (e.g., HKEY_LOCAL_MACHINE\...)
- Add detection for XMP IDs (UUID-like patterns)
- Add basic filename validation
- Log a warning for unrecognized patterns
Example Implementation
def type_of_generic(self):
    """Determine the type of a generic observable."""
    # Email pattern - more comprehensive
    email_pattern = r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$"
    if re.match(email_pattern, self.observable_name):
        return "email"
    # Windows Registry key pattern
    registry_pattern = r"^(HKEY_|HK[A-Z]{2,})"
    if re.match(registry_pattern, self.observable_name, re.IGNORECASE):
        return "registry"
    # XMP ID pattern (UUID-like)
    xmpid_pattern = r"^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
    if re.match(xmpid_pattern, self.observable_name):
        return "xmpid"
    # Filename pattern - basic validation
    filename_pattern = r"^[\w\-. ]+\.[a-zA-Z0-9]{1,10}$"
    if re.match(filename_pattern, self.observable_name):
        return "filename"
    # Default with warning
    logger.warning(
        f"Could not determine type of generic observable: {self.observable_name}. "
        "Defaulting to 'filename'."
    )
    return "filename"
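For quick experimentation outside the analyzer, the same checks can be written as a table-driven free function. This is only a sketch: `classify_generic`, `_GENERIC_TYPE_PATTERNS`, and the sample inputs are hypothetical names introduced here, and the regexes are copied unchanged from the proposal above.

```python
import logging
import re

logger = logging.getLogger(__name__)

# Ordered (type, pattern) pairs; the first match wins.
# The patterns are copied from the proposed implementation, not from InQuest docs.
_GENERIC_TYPE_PATTERNS = [
    ("email", re.compile(r"^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$")),
    ("registry", re.compile(r"^(HKEY_|HK[A-Z]{2,})", re.IGNORECASE)),
    (
        "xmpid",
        re.compile(
            r"^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}"
            r"-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
        ),
    ),
    ("filename", re.compile(r"^[\w\-. ]+\.[a-zA-Z0-9]{1,10}$")),
]


def classify_generic(observable_name: str) -> str:
    """Return the first matching type, defaulting to 'filename' with a warning."""
    for type_, pattern in _GENERIC_TYPE_PATTERNS:
        if pattern.match(observable_name):
            return type_
    logger.warning(
        "Could not determine type of generic observable: %s. "
        "Defaulting to 'filename'.",
        observable_name,
    )
    return "filename"


if __name__ == "__main__":
    # Hypothetical sample inputs, one or two per branch.
    for sample in [
        "user@sub.domain.com",
        "user@example.museum",
        "HKEY_LOCAL_MACHINE\\SOFTWARE\\Example",
        "HKLM\\Software",
        "123e4567-e89b-12d3-a456-426614174000",
        "invoice.pdf",
    ]:
        print(f"{sample!r} -> {classify_generic(sample)}")
```

Two edge cases may be worth discussing before this is finalized. Because the registry check only looks at the prefix and ignores case, a name such as `hkcu_report.docx` is classified as "registry". In the other direction, the abbreviation `HKU` has only one letter after `HK`, so an input like `HKU\.DEFAULT` does not match the registry pattern and falls through to the default "filename" branch.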
I would like to work on this issue if it is approved. Please let me know if any changes to the approach are needed.