How do I URL-decode a string in Java?

In Java, I want to convert this:

https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest_type

to this:

https://mywebsite/docs/english/site/mybook.do?request_type

This is what I have so far:

class StringUTF 
{
    public static void main(String[] args) 
    {
        try{
            String url = 
               "https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do" +
               "%3Frequest_type%3D%26type%3Dprivate";

            System.out.println(url+"Hello World!------->" +
                new String(url.getBytes("UTF-8"),"ASCII"));
        }
        catch(Exception E){
        }
    }
}

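But it does not work properly. What are these %3A and %2F formats called, and how do I convert them?

@Stephen .. why can't a URL be a UTF-8 encoded String ..?

The problem is that, just because a URL can be UTF-8, the question actually has nothing to do with UTF-8. I have edited the question accordingly.

In theory it could be, but the string in your example is not a UTF-8 encoded string. It is a URL-encoded ASCII string. Hence the title was misleading.

It is also worth noting that all of the characters in the URL string are ASCII, and that this is still true after the string has been URL-decoded. '%' is an ASCII character, and %xx represents an ASCII character if xx is less than (hexadecimal) 80.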

Answer #1

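This has nothing to do with character encodings such as UTF-8 or ASCII. The string you have is URL encoded. This kind of encoding is something entirely different from character encoding.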
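To see why the attempt in the question changes nothing, here is a minimal sketch that repeats the same byte round trip. The class and method names are hypothetical and exist only for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question or answer.
class CharsetRoundTripDemo {

    // Mirrors the question's attempt: encode to UTF-8 bytes, then read them back as ASCII.
    static String roundTrip(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        String url = "https%3A%2F%2Fmywebsite%2Fdocs";
        // Every character in the URL-encoded string is ASCII, so its bytes are
        // the same in both charsets and the string comes back unchanged.
        System.out.println(roundTrip(url).equals(url));
    }
}
```

The percent escapes are still there after the round trip, because converting between charsets never interprets `%3A` or `%2F`.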

Try something like this:

try {
    String result = java.net.URLDecoder.decode(url, StandardCharsets.UTF_8.name());
} catch (UnsupportedEncodingException e) {
    // not going to happen - value came from JDK's own StandardCharsets
}

Java 10 added direct support for Charset to the API, which means there is no need to catch UnsupportedEncodingException:

String result = java.net.URLDecoder.decode(url, StandardCharsets.UTF_8);

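Note that a character encoding (such as UTF-8 or ASCII) is what determines the mapping of characters to raw bytes. For a good introduction to character encodings, see this article.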
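As a rough end-to-end illustration, the sketch below decodes the string from the question and then shows where the charset argument matters. It assumes Java 10 or later for the `decode(String, Charset)` overload; the class and helper names are made up for this example.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; requires Java 10+ for decode(String, Charset).
class UrlDecodeDemo {

    static String decodeUtf8(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    static String decodeLatin1(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String url = "https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do"
                + "%3Frequest_type%3D%26type%3Dprivate";
        System.out.println(decodeUtf8(url));

        // %C3%A9 is the two-byte UTF-8 sequence for U+00E9 (e with acute accent).
        // Decoded as UTF-8 it becomes one character; decoded as ISO-8859-1 it
        // becomes two characters, one per byte.
        System.out.println(decodeUtf8("caf%C3%A9").length());
        System.out.println(decodeLatin1("caf%C3%A9").length());
    }
}
```

For escapes below %80, such as %3A and %2F, both charsets give the same result; the choice only matters for non-ASCII data.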

The methods on URLDecoder are static, so you do not have to create a new instance of it.

– laz
May 26, 2011 at 12:37

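@Trismegistos According to the Java 7 API documentation, only the version that does not specify the character encoding (the second parameter, "UTF-8") is deprecated. Use the version with two parameters.

– Jesper
Dec 19, 2012 at 15:47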

If you are using Java 1.7+, you can replace the "UTF-8" string literal with a static constant: StandardCharsets.UTF_8.name(), from java.nio.charset.StandardCharsets. Related to this: link

– Shahar
Apr 30, 2014 at 12:46

On character encoding, this is also a great article: balusc.blogspot.in/2009/05/unicode-how-to-get-characters-right.html

– crackerplace
Jul 16, 2014 at 20:32

Be careful with this. As noted here: blog.lunatech.com/2009/02/03/… this is not about URLs, but about HTML form encoding.

– Michal
May 27, 2015 at 12:29

Answer #2

The string you have is in application/x-www-form-urlencoded encoding.

Use URLDecoder to convert it to a Java String.

URLDecoder.decode( url, "UTF-8" );
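One practical consequence of the form encoding is that '+' stands for a space, so a literal plus sign has to travel as %2B. The following sketch shows this with a round trip; it assumes Java 10+ for the Charset overloads, and the class and method names are invented for the example.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class illustrating application/x-www-form-urlencoded rules.
class FormEncodingDemo {

    static String encodeForm(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String decodeForm(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A space is written as '+', and a literal '+' is escaped as %2B.
        System.out.println(encodeForm("a b+c"));
        // Decoding reverses both substitutions.
        System.out.println(decodeForm("a+b%2Bc"));
        // A '+' that was never form-encoded is still turned into a space.
        System.out.println(decodeForm("1+1=2"));
    }
}
```

If the input may contain unescaped plus signs that should survive, URLDecoder is probably not the right tool for that input.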

Answer #3

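This has been answered before (although this question was first!):

"You should use java.net.URI to do this, as the URLDecoder class does x-www-form-urlencoded decoding which is wrong (despite the name, it's for form data)."

As the URL class documentation states:

The recommended way to manage the encoding and decoding of URLs is
to use URI, and to convert between these two classes using toURI() and
URI.toURL().

The URLEncoder and URLDecoder classes can also be used, but only for
HTML form encoding, which is not the same as the encoding scheme
defined in RFC2396.

Basically: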

String url = "https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest_type";
System.out.println(new java.net.URI(url).getPath());

will give you:

https://mywebsite/docs/english/site/mybook.do?request_type
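To make it clearer what java.net.URI does with each kind of input, here is a small sketch that prints the decoded components. The class name, the `describe` helper, and the second sample URL are assumptions made for this illustration; `URI.create` is used only to avoid the checked URISyntaxException.

```java
import java.net.URI;

// Hypothetical demo class showing which URI component each part ends up in.
class UriPartsDemo {

    static String describe(String s) {
        URI uri = URI.create(s);
        return "scheme=" + uri.getScheme()
                + " authority=" + uri.getAuthority()
                + " path=" + uri.getPath()
                + " query=" + uri.getQuery();
    }

    public static void main(String[] args) {
        // Fully encoded input: there is no literal ':' or '/', so the whole
        // string is parsed as a relative path and getPath() decodes all of it.
        System.out.println(describe(
                "https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest_type"));

        // Structurally intact input: each component is decoded on its own.
        System.out.println(describe(
                "https://mywebsite/docs/my%20book.do?request_type=a%26b"));

        // Unlike URLDecoder, URI leaves '+' alone.
        System.out.println(URI.create("/a+b%20c").getPath());
    }
}
```

So the getPath() call above works for the question's input only because the entire string happens to be treated as a path.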

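In Java 1.7, the URLDecoder.decode(String, String) overload is not deprecated. You must be referring to the URLDecoder.decode(String) overload without the encoding. You might want to update your post to clarify.

– Aaron
Aug 18, 2014 at 18:31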

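This answer is misleading. That block quote has nothing to do with the deprecation. The Javadoc of the deprecated method states, and I quote the actual @deprecated text: "The resulting string may vary depending on the platform's default encoding. Instead, use the decode(String,String) method to specify the encoding."

– Emerson Farrugia
Apr 1, 2015 at 10:30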

getPath() on a URI only returns the path part of the URI, as noted above.

– Pelpotronic
Jul 25, 2016 at 20:33

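Unless I am mistaken, the "path" is known to be the part of a URI after the authority part (see en.wikipedia.org/wiki/Uniform_Resource_Identifier for the definition of path), so it seems to me that the behaviour I am seeing is the standard/correct behaviour. I am using Java 1.8.0_101 (on Android Studio). I would be curious to see what you get when calling getAuthority(). Even this article/example seems to indicate that the path is only the /public/manual/appliances part of their URI: quepublishing.com/articles/article.aspx?p=26566&seqNum=3

– Pelpotronic
Jul 27, 2016 at 18:58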

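@Pelpotronic The code in the post really does print the output it shows (at least for me). I think the reason is that, because of the URL encoding, the URI constructor treats the entire string (https%3A%2F...) as the path of the URI; there is no authority, query, and so on. You can test this by calling the corresponding get methods on the URI object. If you pass the decoded text to the URI constructor instead, new URI("https://mywebsite/do....."), then getPath() and the other methods give the correct results.

– 克鲁
Jun 2, 2019 at 2:26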

Answer #4

%3A and %2F are URL-encoded characters. Use this Java code to convert them back into : and /

String decoded = java.net.URLDecoder.decode(url, "UTF-8");
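For readers wondering what the escapes actually mean: %XX is simply one byte written as two hexadecimal digits (0x3A is ':' and 0x2F is '/'). The hand-rolled decoder below is only a sketch to show that mechanism; it assumes ASCII-only input, does not treat '+' specially, and is not meant to replace URLDecoder or URI. All names in it are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of percent-decoding; assumes the input is pure ASCII.
class PercentDecodeSketch {

    static String percentDecode(String s) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%') {
                if (i + 2 >= s.length()) {
                    throw new IllegalArgumentException("Incomplete escape at index " + i);
                }
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi < 0 || lo < 0) {
                    throw new IllegalArgumentException("Invalid escape at index " + i);
                }
                bytes.write((hi << 4) | lo); // two hex digits form one byte
                i += 2;                      // skip the digits just consumed
            } else if (c > 0x7F) {
                throw new IllegalArgumentException("Non-ASCII character at index " + i);
            } else {
                bytes.write(c);              // plain ASCII is copied through
            }
        }
        // The collected bytes are interpreted as UTF-8 to build the result.
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(percentDecode("https%3A%2F%2Fmywebsite%2Fdocs"));
    }
}
```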

It does not convert %2C either, which is (,)

– vuhung3990
Jan 6, 2015 at 18:45

This needs to be wrapped in a try/catch block. Read more about checked exceptions (this one) versus unchecked ones at stackoverflow.com/questions/6115896/…

– TheNurb
Jul 26, 2016 at 20:52

Answer #5

try {
    String result = URLDecoder.decode(urlString, "UTF-8");
} catch (UnsupportedEncodingException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

Answer #6

public String decodeString(String url) {
    String urlString = "";
    try {
        urlString = URLDecoder.decode(url, "UTF-8");
    } catch (UnsupportedEncodingException e) {
        // Cannot happen: every Java platform is required to support UTF-8
    }
    return urlString;
}

Could you elaborate on your answer and add more explanation of the solution you provide?

– 巴西立酮
Jun 16, 2015 at 7:22

Answer #7

I use Apache Commons

String decodedUrl = new URLCodec().decode(url);

The default charset is UTF-8

Answer #8

import java.io.UnsupportedEncodingException;
import java.net.URISyntaxException;

public class URLDecoding { 

    String decoded = "";

    public String decodeMethod(String url) throws UnsupportedEncodingException
    {
        decoded = java.net.URLDecoder.decode(url, "UTF-8"); 
        return  decoded;
//"You should use java.net.URI to do this, as the URLDecoder class does x-www-form-urlencoded decoding which is wrong (despite the name, it's for form data)."
    }

    public String getPathMethod(String url) throws URISyntaxException 
    {
        decoded = new java.net.URI(url).getPath();  
        return  decoded; 
    }

    public static void main(String[] args) throws UnsupportedEncodingException, URISyntaxException 
    {
        System.out.println(" Here is your Decoded url with decode method : "+ new URLDecoding().decodeMethod("https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest_type")); 
        System.out.println("Here is your Decoded url with getPath method : "+ new URLDecoding().getPathMethod("https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest")); 

    } 

}

You can choose your method wisely :)

Answer #9

Use the java.net.URI class:

public String getDecodedURL(String encodedUrl) {
    try {
        URI uri = new URI(encodedUrl);
        return uri.getScheme() + ":" + uri.getSchemeSpecificPart();
    } catch (Exception e) {
        return "";
    }
}

Note that the exception handling could be better, but it is not very relevant to this example.
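One caveat worth checking: for a string that is encoded in its entirety, such as the one in the question, the URI has no scheme, so getScheme() returns null and plain concatenation would produce a result starting with "null:". A possible variation that guards against this is sketched below; the class name is hypothetical, and `URI.create` is used so that bad input surfaces as an unchecked IllegalArgumentException rather than being swallowed.

```java
import java.net.URI;

// Hypothetical variation on the method above that tolerates a missing scheme.
class DecodedUrlSketch {

    static String getDecodedUrl(String encodedUrl) {
        URI uri = URI.create(encodedUrl);
        // getSchemeSpecificPart() returns everything between the scheme and the
        // fragment, with percent escapes decoded.
        String rest = uri.getSchemeSpecificPart();
        String scheme = uri.getScheme();
        return scheme == null ? rest : scheme + ":" + rest;
    }

    public static void main(String[] args) {
        System.out.println(getDecodedUrl(
                "https%3A%2F%2Fmywebsite%2Fdocs%2Fenglish%2Fsite%2Fmybook.do%3Frequest_type"));
        System.out.println(getDecodedUrl("https://mywebsite/docs/my%20book.do?x=1"));
    }
}
```

Like the original, this drops any fragment, since the scheme-specific part does not include it.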
