net.fortytwo.flow.rdf
Class HTTPUtils

java.lang.Object
  extended by net.fortytwo.flow.rdf.HTTPUtils

public class HTTPUtils
extends Object

Author:
Joshua Shinavier (http://fortytwo.net)

Field Summary
static String ACCEPT
           
static String BODY
           
static String CONTENT_TYPE
           
static String SPARQL_QUERY
           
static String USER_AGENT
           
 
Constructor Summary
HTTPUtils()
           
 
Method Summary
static org.apache.commons.httpclient.HttpClient createClient()
           
static org.apache.commons.httpclient.HttpMethod createGetMethod(String url)
           
static org.apache.commons.httpclient.methods.PostMethod createPostMethod(String url)
           
static org.apache.commons.httpclient.HttpMethod createRdfGetMethod(String url)
           
static org.apache.commons.httpclient.methods.PostMethod createSparqlUpdateMethod(String url)
           
static void setAcceptHeader(org.apache.commons.httpclient.HttpMethod method, String value)
           
static void setAcceptHeader(org.apache.commons.httpclient.HttpMethod method, String[] mimeTypes)
           
static void setContentTypeHeader(org.apache.commons.httpclient.HttpMethod method, String value)
           
static long throttleHttpRequest(org.apache.commons.httpclient.HttpMethod method)
          Enforces crawler etiquette with respect to timing of HTTP requests.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

ACCEPT

public static final String ACCEPT
See Also:
Constant Field Values

BODY

public static final String BODY
See Also:
Constant Field Values

CONTENT_TYPE

public static final String CONTENT_TYPE
See Also:
Constant Field Values

SPARQL_QUERY

public static final String SPARQL_QUERY
See Also:
Constant Field Values

USER_AGENT

public static final String USER_AGENT
See Also:
Constant Field Values

Constructor Detail

HTTPUtils

public HTTPUtils()

Method Detail

createClient

public static org.apache.commons.httpclient.HttpClient createClient()
                                                             throws RippleException
Throws:
RippleException

createGetMethod

public static org.apache.commons.httpclient.HttpMethod createGetMethod(String url)
                                                                throws RippleException
Throws:
RippleException

createPostMethod

public static org.apache.commons.httpclient.methods.PostMethod createPostMethod(String url)
                                                                         throws RippleException
Throws:
RippleException

createRdfGetMethod

public static org.apache.commons.httpclient.HttpMethod createRdfGetMethod(String url)
                                                                   throws RippleException
Throws:
RippleException

createSparqlUpdateMethod

public static org.apache.commons.httpclient.methods.PostMethod createSparqlUpdateMethod(String url)
                                                                                 throws RippleException
Throws:
RippleException

setContentTypeHeader

public static void setContentTypeHeader(org.apache.commons.httpclient.HttpMethod method,
                                        String value)
                                 throws RippleException
Throws:
RippleException

setAcceptHeader

public static void setAcceptHeader(org.apache.commons.httpclient.HttpMethod method,
                                   String value)
                            throws RippleException
Throws:
RippleException

setAcceptHeader

public static void setAcceptHeader(org.apache.commons.httpclient.HttpMethod method,
                                   String[] mimeTypes)
                            throws RippleException
Throws:
RippleException
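
The String[] overload takes several MIME types, and HTTP expresses multiple acceptable types as one comma-separated Accept header value. This page does not document how HTTPUtils builds that value, so the following is only a minimal sketch of the joining step; the class AcceptHeaderSketch and the method joinMimeTypes are hypothetical names, not part of this API.

```java
// Hypothetical helper, not part of HTTPUtils: shows how several MIME types
// can be combined into a single HTTP Accept header value.
final class AcceptHeaderSketch {

    // Joins the given MIME types with ", " as HTTP expects for list-valued headers.
    // Assumption: an empty or null array is treated as a caller error.
    static String joinMimeTypes(String[] mimeTypes) {
        if (mimeTypes == null || mimeTypes.length == 0) {
            throw new IllegalArgumentException("at least one MIME type is required");
        }
        return String.join(", ", mimeTypes);
    }
}
```

A string built this way could then be passed to the setAcceptHeader(HttpMethod, String) overload.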

throttleHttpRequest

public static long throttleHttpRequest(org.apache.commons.httpclient.HttpMethod method)
                                throws RippleException
Enforces crawler etiquette with respect to timing of HTTP requests. That is, it keeps the Ripple client from making a nuisance of itself by sending too many requests to the same host in quick succession. TODO: request and respect robots.txt, if present.

Returns:
the amount of time, in milliseconds, that is spent idling for the sake of crawler etiquette
Throws:
RippleException
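
The per-host timing idea described above can be illustrated with a short, self-contained sketch. It is not the Ripple implementation: the class HostThrottle, its methods, and the fixed minimum interval are assumptions made for illustration, and it keys on a plain host string rather than on an HttpMethod. Like throttleHttpRequest, it reports the number of milliseconds spent idling.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of per-host request throttling; not part of HTTPUtils.
final class HostThrottle {
    // Assumed policy: at least this many milliseconds between requests to one host.
    private final long minIntervalMillis;
    // Time (epoch millis) at which the most recent request to each host was scheduled.
    private final Map<String, Long> lastRequestByHost = new HashMap<>();

    HostThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    // Computes how long a request to the host must wait if issued at nowMillis,
    // and records the time at which that request will actually go out.
    synchronized long reserve(String host, long nowMillis) {
        Long last = lastRequestByHost.get(host);
        long wait = 0L;
        if (last != null) {
            long earliest = last + minIntervalMillis;
            if (earliest > nowMillis) {
                wait = earliest - nowMillis;
            }
        }
        lastRequestByHost.put(host, nowMillis + wait);
        return wait;
    }

    // Sleeps for as long as etiquette requires and returns the idle time in milliseconds.
    // The sleep happens outside the lock so that other hosts are not held up.
    long throttle(String host) throws InterruptedException {
        long idle = reserve(host, System.currentTimeMillis());
        if (idle > 0) {
            Thread.sleep(idle);
        }
        return idle;
    }
}
```

Separating reserve from throttle keeps the timing arithmetic testable without sleeping; a caller would normally use only throttle, immediately before executing a request.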


Copyright © 2007-2014. All Rights Reserved.