Helping Geeks Win the War on Technology Since 2010

Android Function to Fetch HTML Page from Any URL

April 17, 2011

Making use of Internet connectivity on the Android platform often means fetching web pages and reading data from them for your application. To do this, I wrote a function that takes any URL and returns the HTML source code of that page. It is then up to your app to parse that source and retrieve the information you need. I used regular expression matching to strip out the unnecessary bits and get right down to the content I wanted to display.
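As a rough illustration of the regular-expression step described above, here is a minimal sketch that pulls the href values out of a string of HTML. The class name LinkExtractor, the method extractLinks, and the pattern itself are assumptions made for this example, not the code used in my app. Regular expressions are a blunt tool for HTML, so treat the pattern as a starting point to adapt to the markup you actually receive.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the href values of <a> tags in an HTML string.
final class LinkExtractor {
    // Matches href="..." or href='...' inside an <a ...> tag, ignoring case.
    // Group 1 is the quote character, group 2 is the link itself.
    private static final Pattern HREF = Pattern.compile(
            "<a\\b[^>]*?\\bhref\\s*=\\s*([\"'])(.*?)\\1",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    static List<String> extractLinks(String html) {
        List<String> links = new ArrayList<String>();
        Matcher m = HREF.matcher(html);
        while (m.find()) {
            links.add(m.group(2));
        }
        return links;
    }

    public static void main(String[] args) {
        // Sample markup standing in for a fetched page.
        String html = "<p><a href=\"http://example.com/a\">A</a> "
                + "<A class='x' HREF='/b'>B</A></p>";
        System.out.println(extractLinks(html));
    }
}
```

To match a different piece of markup, change the pattern and the group you read from the matcher; the loop stays the same.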

The simple function I use is shown below. It turned out to be a bit slow for the URLs I sent it, so I eventually moved the HTML processing to my own web server and had my Android application simply fetch the processed data from there. Because the call blocks until the response arrives, your app may pop up a dialog if the request takes too long. It’s usually fine on Wi-Fi, but 3G can be slow at times. As with any code, test before you deploy to see whether it suits your needs. At the very least, you have a basis for tweaking in case you want to build in some slow response handling.

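import java.io.IOException;
import org.apache.http.client.ClientProtocolException;
import org.apache.http.client.ResponseHandler;
import org.apache.http.client.methods.HttpGet;
import org.apache.http.impl.client.BasicResponseHandler;
import org.apache.http.impl.client.DefaultHttpClient;

// Returns the HTML source of the page at the given URL, or "" if the request fails.
public String getURLContent(String url)
{
    try {
        DefaultHttpClient httpClient = new DefaultHttpClient();
        HttpGet httpGet = new HttpGet(url);
        // BasicResponseHandler returns the response body as a String
        // and throws HttpResponseException for status codes >= 300.
        ResponseHandler<String> resHandler = new BasicResponseHandler();
        String page = httpClient.execute(httpGet, resHandler);
        return page;
    } catch (ClientProtocolException e) {
        // HTTP protocol error, including error status codes
        return "";
    } catch (IOException e) {
        // Connection problem or aborted request
        return "";
    }
}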
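
For the slow response handling mentioned above, one possible approach is to set timeouts on the connection and to give the whole fetch a deadline. The sketch below is only an illustration under stated assumptions: it swaps in the standard java.net.HttpURLConnection instead of DefaultHttpClient, it assumes the page is UTF-8, and the names TimedFetch, readAll, fetch, and withDeadline are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper class for this example.
final class TimedFetch {
    // Reads until end-of-stream rather than assuming a single read()
    // call returns the whole body.
    static String readAll(InputStream in, String charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toString(charset);
    }

    // Fetches a page with connect and read timeouts (assumes UTF-8 content).
    static String fetch(String url, int timeoutMs) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setConnectTimeout(timeoutMs);
        conn.setReadTimeout(timeoutMs);
        try {
            InputStream in = conn.getInputStream();
            try {
                return readAll(in, "UTF-8");
            } finally {
                in.close();
            }
        } finally {
            conn.disconnect();
        }
    }

    // Runs the task on a worker thread and waits at most deadlineMs for it.
    // Returns the fallback value on timeout, failure, or interruption.
    static String withDeadline(Callable<String> task, long deadlineMs, String fallback) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<String> future = pool.submit(task);
        try {
            return future.get(deadlineMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A Callable that calls getURLContent(url) or TimedFetch.fetch(url, 5000) can be passed to withDeadline. Note that withDeadline still blocks its caller until the result or the deadline arrives, so in an Android app it should itself be called from a background thread rather than from the UI thread.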

posted in Android by helpgeek


6 Comments to "Android Function to Fetch HTML Page from Any URL"

  1. Dich Thuat wrote:

    How can I make an Android app that reads magazines automatically?

  2. cho phan mem wrote:

    It's OK. You don't need to do this. You can use an XML parser. I now have 2 apps that read magazines. If you want them, please send me an email. He he

  3. Android/Java Function for Regular Expression Search Like preg_match | Geek Help Guide wrote:

    [...] Pages as Inline List without CSS; Android Saving Persistent Data Between Application Sessions; Android Function to Fetch HTML Page from Any URL; Why No Comments Are Showing on isoHunt; Hide or Disable the Admin Bar in WordPress 3.1; Fix Blank Admin [...]

  4. diya wrote:

    I want to get only the links (the href values) from a website.
    How can I achieve that?

  5. helpgeek wrote:

    Use regular expressions. It’s all explained in this post. Simply change the HTML markup to the pattern you need, in this case href.

  6. Daksh wrote:

    While using the above (or any other) logic to fetch HTML source, I run into a weird problem: the HTML source is fetched only up to a certain limit.

    When using an InputStream, it ends at 4104, and when using ResponseHandler, it ends at 5616. Do you have any idea why that is happening?

    I’m really at a standstill because of this problem!
