A simple question; I just want to be sure.

The Google sitemap generator produced a sitemap.txt file containing links like this:

http://www.domain.com/category.htm?name=some-name&amp;cat_id=8


Is it correct to use &amp; instead of & in these links, or is this simply a mistake made by the sitemap generator?

Thanks.

Answer #1

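That is correct. &amp; is the HTML entity for the ampersand (&), and it is the right way to represent that character when a properly encoded URL is embedded in markup. The ampersand (&), along with < and >, is a special character in XML and HTML and has to be written as a character entity to be displayed literally.

Comments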
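As a minimal sketch of the escaping described above, Python's standard `xml.sax.saxutils.escape` replaces `&`, `<` and `>` with their entities. The helper name `xml_escape_url` is made up for illustration; the URL is the one from the question.

```python
from xml.sax.saxutils import escape


def xml_escape_url(url: str) -> str:
    """Escape &, < and > so the URL can be placed in XML text content."""
    return escape(url)


raw = "http://www.domain.com/category.htm?name=some-name&cat_id=8"
print(xml_escape_url(raw))
```

The raw URL is what a browser requests; the escaped form is only how that URL is written inside the XML file.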


Are you sure all ampersands have to be entity-escaped? I thought that was only for those separating parameters in query strings. I've always escaped ampersands within file or folder names or in the parameters themselves using percent-encoding (%26): e.g. http://foo/a%26r.php?foo=1&amp;genre=r%26b

– Lèse majesté
Oct 21 '10 at 19:37



I'm not sure whether percent-encoding works there, so I can't say for certain.

–John Conde♦
2010-10-21 19:48



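@Lèse - Because it is an XML document, the ampersand must be escaped unless you use a CDATA node (I just noticed that bdadam said the same thing, and much earlier than I did).

– Mark Henderson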
2010-10-26 20:30



The > character does not strictly need to be encoded as an XML entity.

– MrWhite
2014-02-20 12:57



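Answer #2

Your sitemap file must be UTF-8 encoded (you can usually choose this when saving the file). As with all XML files, any data values (including URLs) must use entity escape codes for special characters such as the ampersand.

This may help: http://sitemaps.org/protocol.php

Comments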
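One way to get both requirements (UTF-8 output and entity escaping) without escaping by hand is to let an XML library serialize the file. The sketch below uses Python's `xml.etree.ElementTree`; the function name `build_sitemap` is hypothetical, the namespace URI is the one published by the sitemaps protocol, and the `xml_declaration` argument assumes Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps protocol (not stated in this thread).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a UTF-8 encoded sitemap document (bytes) for the given raw URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for raw_url in urls:
        url_el = ET.SubElement(urlset, "url")
        # Assign the raw URL; the serializer escapes & as &amp; on output.
        ET.SubElement(url_el, "loc").text = raw_url
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


sitemap = build_sitemap(
    ["http://www.domain.com/category.htm?name=some-name&cat_id=8"]
)
print(sitemap.decode("utf-8"))
```

Passing the unescaped URL matters here: escaping it first and then serializing would double-escape the ampersand.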


Unfortunately, this link is now dead.

– mtness
Nov 14 '18 at 12:42

Answer #3

You can also convince yourself by looking at:


this article on sitemaps and &amp;
the official XML sitemaps protocol page

:)

Answer #4

Google rejects the sitemap as broken if it has a & character in a URL. It accepts the sitemap when you replace & with &amp;.

BUT: if you later check the list of crawl errors in Google's webmaster tools, it will report this URL from the sitemap file as broken, because it contains &amp; instead of &.

Thus the correct solution is to change the URL so that it does not contain &, or to report this as a bug to Google.

Answer #5
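As a quick check of what a standard XML parser does with each form, the sketch below uses Python's `xml.etree.ElementTree` on a bare `<loc>` fragment. The helper names are made up for illustration. This says nothing about how Google's own tools report URLs; it only shows that a raw & is not well-formed XML and that &amp; is decoded back to & when the file is read.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_fragment)
        return True
    except ET.ParseError:
        return False


def read_loc(xml_fragment: str) -> str:
    """Return the decoded text content of the fragment's root element."""
    return ET.fromstring(xml_fragment).text


raw = "<loc>http://www.domain.com/category.htm?name=some-name&cat_id=8</loc>"
escaped = "<loc>http://www.domain.com/category.htm?name=some-name&amp;cat_id=8</loc>"

print(is_well_formed(raw))      # the bare & makes the parser fail
print(is_well_formed(escaped))
print(read_loc(escaped))        # the parser hands back a plain &
```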

URL encoding and XML entity encoding are not the same thing.
You need URL encoding to replace special characters in URLs, such as &, which can otherwise only be used to separate query parameters.
XML entity encoding is for encoding special characters in XML (and also XHTML). This means that if you have a URL in an XML (or XHTML) file, and this URL includes some & characters, you have to entity-encode each of them as &amp;. So in a sitemap.xml you will have URLs like the one in the question from Marco Demaio.
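To make the two layers concrete, here is a small sketch that applies them in order: percent-encoding first (for an ampersand that is part of a name or value), then XML entity escaping (for the whole URL as it is written into the file). The function `build_xml_url` and its parameters are hypothetical; the host `foo` and the values come from the example URL in the comments above.

```python
from urllib.parse import quote, urlencode
from xml.sax.saxutils import escape


def build_xml_url(host: str, filename: str, params: dict) -> str:
    """Percent-encode the URL parts, then entity-escape the result for XML."""
    # Layer 1: URL encoding. A literal & inside a name or value becomes %26.
    path = "/" + quote(filename, safe="")
    query = urlencode(params)  # joins parameters with a separator &
    url = "http://" + host + path + "?" + query
    # Layer 2: XML entity encoding. The separator & becomes &amp;.
    return escape(url)


print(build_xml_url("foo", "a&r.php", {"foo": "1", "genre": "r&b"}))
```

Reading the file reverses the layers in the opposite order: the XML parser turns &amp; back into &, and the web server then decodes %26.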