How can I parse XML that conforms to the 1.1 spec using Java and Xerces?
Question
I'm trying to parse a String which contains XML content which conforms to the XML 1.1 spec. The XML contains character references which are not allowed in the XML 1.0 spec but which are allowed in the XML 1.1 spec (character references which translate to Unicode characters in the range U+0001–U+001F).
According to the Xerces2 website, the Xerces2 parser supports parsing XML 1.1 documents. However, I cannot figure out how to tell it that the XML we are trying to parse is XML 1.1.
I'm using a DocumentBuilder to parse the XML (something like this):
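import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public Element parseString(String xmlString) {
    try {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        DocumentBuilder documentBuilder = dbf.newDocumentBuilder();
        InputSource source = new InputSource(new StringReader(xmlString));
        // Throws org.xml.sax.SAXParseException because of the invalid character refs
        Document doc = documentBuilder.parse(source);
        return doc.getDocumentElement();
    } catch (ParserConfigurationException pce) {
        // Handle the error
    } catch (SAXException se) {
        // Handle the error
    } catch (IOException ioe) {
        // Handle the error
    }
    // Reached only when one of the handlers above swallowed an exception
    return null;
}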
I've tried setting the XML header to indicate the XML conforms to the 1.1 spec...
xmlString = "<?xml version=\"1.1\" encoding=\"UTF-8\" ?>" + xmlString;
...but it is still parsed as 1.0 XML (still generates the invalid character reference exceptions).
How can I configure the Xerces parser to parse the XML as XML 1.1? Is there an alternative parser which provides better support for XML 1.1?
Recommended answer
See the Xerces2 features documentation for a list of all the features supported by Xerces. The two features below may be the ones you need to turn on.
http://xml.org/sax/features/unicode-normalization-checking
True: Perform Unicode normalization checking (as described in section 2.13 and Appendix B of the XML 1.1 Recommendation) and report normalization errors.
False: Do not report Unicode normalization errors.
http://xml.org/sax/features/xml-1.1
True: The parser supports both XML 1.0 and XML 1.1.
False: The parser supports only XML 1.0.
Access: read-only
Since: Xerces-J 2.7.0
Note: The value of this feature will depend on whether the parser configuration owned by the SAX parser is known to support XML 1.1.
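Because the xml-1.1 feature is documented above as read-only, it can only be queried, not switched on. The following is a minimal sketch of how one might check it through the standard JAXP/SAX API, together with a small probe that reports whether a document containing the character reference &#x1; parses under a given XML declaration. The class name Xml11Check and its method names are hypothetical, and the results depend on which parser implementation JAXP resolves to on your classpath, so treat the output as diagnostic only.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
public class Xml11Check {
    static final String XML_11_FEATURE = "http://xml.org/sax/features/xml-1.1";

    /** Returns "true" or "false" as reported by the parser, or "unknown" if the feature cannot be queried. */
    static String xml11Support() {
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            return String.valueOf(reader.getFeature(XML_11_FEATURE));
        } catch (SAXException | ParserConfigurationException e) {
            // Covers SAXNotRecognizedException and SAXNotSupportedException as well
            return "unknown";
        }
    }

    /** Returns true if the document parses, false if the parser rejects it with a SAXException. */
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String body = "<a>&#x1;</a>"; // U+0001 is not a legal character in XML 1.0
        System.out.println("xml-1.1 feature: " + xml11Support());
        System.out.println("parses as XML 1.0: " + parses("<?xml version=\"1.0\"?>" + body));
        System.out.println("parses as XML 1.1: " + parses("<?xml version=\"1.1\"?>" + body));
    }
}
```

If the last line reports false even though the feature is reported as true, it is worth checking that the XML declaration really is the very first thing in the string being parsed, since a declaration is only recognized at the start of the document.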