Given a string of HTML attributes and values, parse into a structured attribute list.
Description
This function performs a number of transformations while parsing attribute strings:
- It normalizes attribute values and surrounds them with double quotes.
- It normalizes HTML character references inside attribute values.
- It removes “bad” URL protocols from attribute values.
Otherwise this reads the attributes as if they were part of an HTML tag. It performs these transformations to lower the risk of mis-parsing down the line and to perform URL sanitization in line with the rest of the kses subsystem. Importantly, it does not decode the attribute values, meaning that special HTML syntax characters will be left with character references in the value property.
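Because the `value` property keeps its character references, a caller that needs the plain-text value has to decode it. The following is a minimal sketch using PHP's built-in `html_entity_decode()`. The `$entry` array is a hand-written stand-in shaped like one element of the returned list, not output captured from a real call, and WordPress code may prefer its own decoding helpers.

```php
<?php
// Hand-written stand-in for one entry of the list wp_kses_hair() returns (assumption).
$entry = array(
    'name'  => 'data-lazy',
    'value' => '&lt;img&gt;',
    'whole' => 'data-lazy="&lt;img&gt;"',
    'vless' => 'n',
);

// ENT_QUOTES | ENT_HTML5 also decodes &quot; and &apos;.
$decoded = html_entity_decode( $entry['value'], ENT_QUOTES | ENT_HTML5, 'UTF-8' );

echo $decoded, PHP_EOL;
```

The decoded string is raw text and would need escaping again before being written back into HTML.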
Example:
$attrs = wp_kses_hair( 'class="is-wide" inert data-lazy=\'<img>\' =/🐮=/' );
$attrs === array(
    'class'     => array( 'name' => 'class', 'value' => 'is-wide', 'whole' => 'class="is-wide"', 'vless' => 'n' ),
    'inert'     => array( 'name' => 'inert', 'value' => '', 'whole' => 'inert', 'vless' => 'y' ),
    'data-lazy' => array( 'name' => 'data-lazy', 'value' => '&lt;img&gt;', 'whole' => 'data-lazy="&lt;img&gt;"', 'vless' => 'n' ),
    '='         => array( 'name' => '=', 'value' => '', 'whole' => '=', 'vless' => 'y' ),
    '🐮'        => array( 'name' => '🐮', 'value' => '/', 'whole' => '🐮="/"', 'vless' => 'n' ),
);
Parameters
$attr (string, required) - Attribute list from HTML element to closing HTML element tag.
$allowed_protocols (string[], required) - Array of allowed URL protocols.
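As a usage sketch for the two parameters, the snippet below passes WordPress's default protocol list from `wp_allowed_protocols()` as `$allowed_protocols`. The wrapper name `sketch_parse_link_attributes()` is hypothetical, and the `function_exists()` guard is only there so the snippet also runs outside WordPress, where it returns an empty array.

```php
<?php
/**
 * Hypothetical wrapper (not part of WordPress): parses an attribute string
 * using the default allowed protocols, or returns an empty array when
 * WordPress is not loaded.
 */
function sketch_parse_link_attributes( string $attr ): array {
    if ( ! function_exists( 'wp_kses_hair' ) || ! function_exists( 'wp_allowed_protocols' ) ) {
        return array(); // WordPress is not available in this context.
    }

    return wp_kses_hair( $attr, wp_allowed_protocols() );
}

// Print the normalized form of each parsed attribute.
foreach ( sketch_parse_link_attributes( 'href="javascript:alert(1)" title="Docs"' ) as $name => $attribute ) {
    echo $name, ': ', $attribute['whole'], PHP_EOL;
}
```

Since `javascript` is not among the default allowed protocols, the `href` value should come back with that scheme stripped, while `title` is not treated as a URI attribute and should pass through without protocol filtering.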
Source
function wp_kses_hair( $attr, $allowed_protocols ) {
    $attributes = array();
    $uris       = wp_kses_uri_attributes();

    $processor = new WP_HTML_Tag_Processor( "<wp {$attr}>" );
    $processor->next_token();

    $attribute_names = $processor->get_attribute_names_with_prefix( '' );
    if ( null === $attribute_names || 0 === count( $attribute_names ) ) {
        return $attributes;
    }

    $syntax_characters = array(
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        "'" => '&apos;',
        '"' => '&quot;',
    );

    foreach ( $attribute_names as $name ) {
        $value   = $processor->get_attribute( $name );
        $is_bool = true === $value;
        if ( is_string( $value ) && in_array( $name, $uris, true ) ) {
            $value = wp_kses_bad_protocol( $value, $allowed_protocols );
        }

        // Reconstruct and normalize the attribute value.
        $recoded = $is_bool ? '' : strtr( $value, $syntax_characters );
        $whole   = $is_bool ? $name : "{$name}=\"{$recoded}\"";

        $attributes[ $name ] = array(
            'name'  => $name,
            'value' => $recoded,
            'whole' => $whole,
            'vless' => $is_bool ? 'y' : 'n',
        );
    }

    return $attributes;
}
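Each entry's `whole` property already holds a normalized `name="value"` pair (or the bare name for a valueless attribute), so a caller can filter the list and join the survivors back into an attribute string. The helper below is a sketch of that idea. The name `sketch_join_allowed_attributes()` and the simple name allowlist are assumptions for illustration, not how kses itself filters attributes, and the sample array is hand-written in the shape of the docblock example.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): keeps only the attributes
 * whose names appear in $allowed_names and joins their normalized forms.
 */
function sketch_join_allowed_attributes( array $attributes, array $allowed_names ): string {
    $kept = array();

    foreach ( $attributes as $name => $attribute ) {
        // Numeric-looking names become integer array keys in PHP, so cast back to string.
        if ( in_array( (string) $name, $allowed_names, true ) ) {
            $kept[] = $attribute['whole'];
        }
    }

    return implode( ' ', $kept );
}

// Hand-written sample in the shape wp_kses_hair() returns (assumption).
$attributes = array(
    'class'     => array( 'name' => 'class', 'value' => 'is-wide', 'whole' => 'class="is-wide"', 'vless' => 'n' ),
    'inert'     => array( 'name' => 'inert', 'value' => '', 'whole' => 'inert', 'vless' => 'y' ),
    'data-lazy' => array( 'name' => 'data-lazy', 'value' => '&lt;img&gt;', 'whole' => 'data-lazy="&lt;img&gt;"', 'vless' => 'n' ),
);

$joined = sketch_join_allowed_attributes( $attributes, array( 'class', 'inert' ) );

echo $joined, PHP_EOL;
```

With this sample the joined string keeps `class` and `inert` and drops `data-lazy`.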