java.lang.Object
org.apache.jena.riot.web.LangTag
Language tags: support for parsing and canonicalization of case.
Grandfathered forms ("i-") are left untouched. Unsupported or syntactically
illegal forms are handled in canonicalization by doing nothing.
-
Field Summary
Fields
Modifier and Type    Field          Description
static final int     idxExtension   Index of all extensions
static final int     idxLanguage    Index of the language part
static final int     idxRegion      Index of the region part
static final int     idxScript      Index of the script part
static final int     idxVariant     Index of the variant part
Method Summary
Modifier and Type    Method      Description
static String        canonical   Canonicalize with the rules of RFC 4646, or RFC 5646 without replacement of preferred form.
static String        canonical   Canonicalize with the rules of RFC 4646 "In this format, all non-initial two-letter subtags are uppercase, all non-initial four-letter subtags are titlecase, and all other subtags are lowercase."
static boolean       check       Validate - basic syntax check for a language tag: [a-zA-Z]+ ('-'[a-zA-Z0-9]+)*
static String[]      parse       Parse a langtag string and return its parts in canonical case.
-
Field Details
-
idxLanguage
public static final int idxLanguage
Index of the language part
-
idxScript
public static final int idxScript
Index of the script part
-
idxRegion
public static final int idxRegion
Index of the region part
-
idxVariant
public static final int idxVariant
Index of the variant part
-
idxExtension
public static final int idxExtension
Index of all extensions
-
-
Method Details
-
check
Validate - basic syntax check for a language tag: [a-zA-Z]+ ('-'[a-zA-Z0-9]+)*
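The grammar above translates directly into a Java regular expression. The following is a minimal standalone sketch of that check, not the LangTag implementation; the class name LangTagSyntaxSketch and the method looksLikeLangTag are hypothetical names introduced for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical illustration class; not part of org.apache.jena.riot.web.
final class LangTagSyntaxSketch {

    // Java regex form of the grammar: [a-zA-Z]+ ('-'[a-zA-Z0-9]+)*
    private static final Pattern BASIC_SYNTAX =
            Pattern.compile("[a-zA-Z]+(-[a-zA-Z0-9]+)*");

    // Returns true if the whole string matches the basic grammar.
    // This is a syntax check only: it does not consult any registry.
    public static boolean looksLikeLangTag(String tag) {
        return tag != null && BASIC_SYNTAX.matcher(tag).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeLangTag("en-GB"));   // true
        System.out.println(looksLikeLangTag("en_GB"));   // false: '_' is not a separator
        System.out.println(looksLikeLangTag("123"));     // false: first subtag must be letters
        System.out.println(looksLikeLangTag("en-"));     // false: empty trailing subtag
    }
}
```

Because the grammar is purely syntactic, a string such as "xx-yy-zz" passes even though it names no registered language.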
parse
Parse a langtag string and return its parts in canonical case. See the idx constants for the array contents. A part that is not present is null in the returned array.
Returns:
Langtag parts, or null if the input string does not parse as a lang tag.
-
canonical
Canonicalize with the rules of RFC 4646, or RFC 5646 without replacement of preferred form.
canonical
Canonicalize with the rules of RFC 4646 "In this format, all non-initial two-letter subtags are uppercase, all non-initial four-letter subtags are titlecase, and all other subtags are lowercase." In addition, leave extensions unchanged.
This is the same as RFC 5646 without replacement of preferred form or consulting the registry.
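The quoted case rule can be sketched in a few lines of plain Java. This is an illustration of the rule only, under simplifying assumptions: it does not special-case extensions or grandfathered forms (which LangTag leaves unchanged), and the names LangTagCaseSketch and canonicalCase are hypothetical, not part of the Jena API.

```java
import java.util.Locale;

// Hypothetical illustration class; not part of org.apache.jena.riot.web.
final class LangTagCaseSketch {

    // Applies the quoted RFC 4646 case convention to every subtag.
    // Simplification: extensions and grandfathered tags get no special handling.
    public static String canonicalCase(String tag) {
        String[] subtags = tag.split("-");
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < subtags.length; i++) {
            // Default: all subtags are lowercase.
            String s = subtags[i].toLowerCase(Locale.ROOT);
            if (i > 0) {
                out.append('-');
                if (s.length() == 2) {
                    // Non-initial two-letter subtag: uppercase.
                    s = s.toUpperCase(Locale.ROOT);
                } else if (s.length() == 4) {
                    // Non-initial four-letter subtag: titlecase.
                    s = Character.toUpperCase(s.charAt(0)) + s.substring(1);
                }
            }
            out.append(s);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(canonicalCase("EN-gb"));       // en-GB
        System.out.println(canonicalCase("zh-hant-tw"));  // zh-Hant-TW
        System.out.println(canonicalCase("SL-ROZAJ"));    // sl-rozaj
    }
}
```

Locale.ROOT is used for the case conversions so that the result does not depend on the default locale of the JVM.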
-