APPLICATION PROFILING


Application profiling addresses the unique structure, logic, and features of an individual, highly customized web application. Application vulnerabilities may be subtle and may take substantial research to detect and exploit.

The purpose of surveying the application is to generate a complete picture of the content, components,
function, and flow of the web site in order to gather clues about where underlying vulnerabilities might be.
Whereas an automated vulnerability checker typically searches for known vulnerable URLs, the goal of an
extensive application survey is to see how each of the pieces fit together. A proper inspection can reveal problems with aspects of the application beyond the presence or absence of certain traditional vulnerability signatures.

On the surface, application profiling is easy: you simply crawl or click through the application and pay attention to the URLs and how the entire web site is structured.

You should be able to recognize:

  • Site Language
  • Site structure
  • Dynamic content

Web application profiling comprises the following key tasks:

  • Manual inspection
  • Search tools
  • Automated crawling
  • Common web application profiles

Manual Inspection

  • Clicking through links
  • Becoming familiar with the site
  • Looking for all the menus and directory names
  • Looking at the URL field
  • Documenting the application structure

Documenting the Application structure

  • Page name
  • Full path to the page
  • Does the page require authentication?
  • Does the page require SSL? (Remove the 's' from 'https' and try to access the page.)
  • GET/POST arguments
  • Comments
  • Is it a search, admin, or help function?
  • Does the page “feel” insecure?
  • Statically and dynamically generated pages
  • Directory structure
  • Common file extensions
  • Common files
  • Helper files
  • Java classes and applets
  • Flash and Silverlight objects
  • HTML source code
  • Forms
  • Query strings and parameters
  • Common cookies
  • Back-end access points
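
One lightweight way to keep these notes consistent is to record every page in a simple structure as you click through. The sketch below is illustrative only; the field names are assumptions drawn from the checklist above, not part of any standard tool.

from dataclasses import dataclass, field

@dataclass
class PageNote:
    """One row of the application-structure documentation described above."""
    name: str                     # page name
    path: str                     # full path to the page
    requires_auth: bool = False   # does it require authentication?
    requires_ssl: bool = False    # still reachable after dropping the 's' in https?
    method: str = "GET"           # GET or POST arguments observed
    params: list = field(default_factory=list)  # query/form parameter names
    comments: str = ""            # notable HTML comments
    notes: str = ""               # e.g. "admin function", "feels insecure"

# Example entry collected during a manual walk-through (hypothetical page):
profile = [
    PageNote(name="login", path="/account/login.asp", requires_ssl=True,
             method="POST", params=["username", "passwd"],
             notes="sets session cookie"),
]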

Statically and Dynamically Generated Pages

Static pages are generic .html files, but their source may still contain revealing comments or other information.

Dynamically generated pages (.asp, .jsp, .php, etc.) may contain “administrator functions,” “user profile information,” or a “cart view.” It’s a good idea to mirror the structure and content of the application to local disk.

Directory Structure

The web server may have directories for:

  • Administrators: /admin/, /secure/, /adm/
  • Old versions of the site: /archive/, /old/
  • Backup directories: /.bak/, /backup/, /back/
  • Log directories: /log/, /logs/
  • Data directories
  • Other directories
  • Personal Apache directories: /~root/, /~bob/, /~cthulhu/
  • Directories for include files: /include/, /inc/, /js/, /global/, /local/
  • Directories used for internationalization: /de/, /en/, /1033/, /fr/

OWASP DirBuster can be used to brute-force these directory names.
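
A minimal probe along these lines does the same job if you want to script it yourself before reaching for DirBuster; the wordlist and base URL are illustrative assumptions.

import requests

COMMON_DIRS = ["admin", "secure", "adm", "archive", "old", "backup", "back",
               "log", "logs", "include", "inc", "js", "global", "local"]

def probe(base_url, names, suffix="/"):
    """Request each candidate path and report anything that is not a 404."""
    found = []
    for name in names:
        url = f"{base_url.rstrip('/')}/{name}{suffix}"
        r = requests.get(url, timeout=5, allow_redirects=False)
        if r.status_code != 404:
            found.append((url, r.status_code))
    return found

# for url, status in probe("http://www.site.com", COMMON_DIRS):
#     print(status, url)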

Common File Extensions

Extensions are a great indicator of the nature of an application. File extensions are used to determine the type of a file, either by language or by application association; they also tell the web server how to handle the file. We can simply search for an extension with an Internet search engine such as Google (for example, using the syntax “allinurl:.cfm”). This identifies other sites that use the extension, which can help you narrow down which applications it is associated with.
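
A few of these associations are common knowledge and worth keeping at hand. The sketch below simply tallies the extensions seen in the URLs you have collected so far and suggests a search where the extension is unfamiliar; the hint table is illustrative, not exhaustive.

from collections import Counter
from urllib.parse import urlparse
import os

# Well-known extension-to-technology hints; extend as needed.
EXTENSION_HINTS = {
    ".php": "PHP", ".asp": "Classic ASP", ".aspx": "ASP.NET",
    ".jsp": "Java (JSP)", ".do": "Java (Struts)", ".cfm": "ColdFusion",
    ".pl": "Perl", ".cgi": "CGI script",
}

def tally_extensions(urls):
    """Count the file extensions seen in a list of discovered URLs."""
    exts = Counter()
    for u in urls:
        ext = os.path.splitext(urlparse(u).path)[1].lower()
        if ext:
            exts[ext] += 1
    return exts

# for ext, count in tally_extensions(discovered_urls).items():
#     print(ext, count, EXTENSION_HINTS.get(ext, "unknown; try allinurl:" + ext))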

Common Files

Most software installations will come with a number of well-known files, for instance:

   • Readme
   • ToDo
   • Changes
   • Install.txt
   • EULA.txt

By searching every folder and subfolder in a site, you may hit on plenty of useful information that tells you which applications and versions are running, and perhaps a URL that leads to a download page for the software and its updates. If you don't have the time or the ability to check every folder, always be sure to at least hit the site's root directory, where these file types are often held (for example, http://www.site.com/Readme.txt).

Most administrators or developers will follow a default install, or they will unzip the entire contents of the archive right into the web root.
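
The probe sketch from the directory section works here unchanged; just feed it a file wordlist and drop the trailing slash. The candidate names below are illustrative.

COMMON_FILES = ["Readme", "readme.txt", "ToDo", "Changes", "CHANGELOG",
                "Install.txt", "INSTALL", "EULA.txt", "license.txt"]

# Re-using probe() from the directory sketch above:
# for url, status in probe("http://www.site.com", COMMON_FILES, suffix=""):
#     print(status, url)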

Helper Files

“Helper file” is a catch-all term for any file that supports the application but usually does not appear in the URL. Common helpers are JavaScript files; they are often used to format HTML to fit the quirks of popular browsers or to perform client-side input validation.

• Cascading Style Sheet files (.css): rarely contain sensitive information
• XML style sheets (.xsl): a wealth of information, often listing database fields or referring to other helper files
• JavaScript files (.js): client-side functions and validation routines
• Include files (.inc): often control database access
• ASP, PHP, Perl, text, and other files

URLs rarely refer to these files directly, so you must turn to the HTML source in order to find them. Look for references to them in server-side include directives and in script tags.
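
A small parser over the mirrored HTML will pull these references out for you. This sketch uses only the Python standard library and looks at script tags, link tags, and SSI-style include comments; it is a starting point, not a complete extractor.

from html.parser import HTMLParser
import re

class HelperFinder(HTMLParser):
    """Collect .js/.css/.xsl and other helper files referenced by a page."""
    def __init__(self):
        super().__init__()
        self.helpers = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "script" and attrs.get("src"):
            self.helpers.add(attrs["src"])
        elif tag == "link" and attrs.get("href"):
            self.helpers.add(attrs["href"])

    def handle_comment(self, data):
        # Server-side include directives, e.g. <!--#include file="header.inc"-->
        m = re.search(r'#include\s+(?:file|virtual)="([^"]+)"', data)
        if m:
            self.helpers.add(m.group(1))

finder = HelperFinder()
# finder.feed(open("mirror/index.html").read())
# print(sorted(finder.helpers))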

Java Classes and Applets

Applets are some of the most insecure pieces of software. Many developers give no consideration to the fact that applets can easily be decompiled and give up huge amounts of information.

If you can download the Java classes or compiled servlets, then you can actually pick apart an application from the inside. Imagine if an application used a custom encryption scheme written in a Java servlet. Now, imagine you can download that servlet and peek inside the code.

Finding applets in web applications is fairly simple: just look for an applet tag, which looks like this:

<applet code = "MainMenu.class"
codebase="http://www.site.com/common/console" id = "scroller">
<param name = "feeder" value
="http://www.site.com/common/console/CWTR1.txt">
<param name = "font" value = "name=Dialog, style=Plain, size=13">
<param name = "direction" value = "0">
<param name = "stopAt" value = "0">
</applet>

Having access to the internal functions of the site enables you to inspect database calls, file formats, input validation (or lack thereof), and other server capabilities.

You may find it difficult to obtain the actual Java class, but try a few tricks such as these:

• If the site uses a servlet called “/servlet/LogIn”, look for “/servlet/LogIn.class”.
• Search for servlets in backup directories (both tricks are scripted in the sketch below).
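
The sketch below guesses a few locations for a compiled servlet and checks the response for the Java class-file magic bytes; the backup directory names are assumptions, not guarantees.

import requests

def find_servlet_class(base_url, servlet_name):
    """Try to fetch the compiled class behind a servlet such as /servlet/LogIn."""
    candidates = [
        f"{base_url}/servlet/{servlet_name}.class",
        f"{base_url}/backup/{servlet_name}.class",
        f"{base_url}/old/{servlet_name}.class",
    ]
    for url in candidates:
        r = requests.get(url, timeout=5)
        # Compiled Java class files begin with the magic bytes 0xCAFEBABE.
        if r.status_code == 200 and r.content[:4] == b"\xca\xfe\xba\xbe":
            return url, r.content
    return None

# hit = find_servlet_class("http://www.site.com", "LogIn")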

HTML Source Code

HTML source code can contain comments in which the authors place informal remarks that can be quite revealing. Although rare, such comments may describe a database table used by a subsequent SQL query or, worse yet, contain user passwords. You can collect the source with web crawling tools and then search it for strings such as:

SQL, Select, Insert
#include, #exec, Password
Database, Connect, //

Another interesting thing to search for in HTML is tags that denote server-side execution, such as <? and ?> for PHP and <% and %> or runat="server" for ASP and ASP.NET pages. These can reveal interesting tidbits that the site developer never intended the public to see.
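
Once you have a local mirror, a simple grep-style pass over the source finds these strings and server-side tags. The pattern list below mirrors the table above and is easy to extend.

import os, re

PATTERNS = [r"\bSQL\b", r"\bSelect\b", r"\bInsert\b", r"#include", r"#exec",
            r"Password", r"Database", r"Connect", r"<\?", r"<%", r'runat="server"']

def grep_mirror(root):
    """Scan every mirrored file for potentially revealing strings."""
    hits = []
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                text = open(path, errors="ignore").read()
            except OSError:
                continue
            for pat in PATTERNS:
                for m in re.finditer(pat, text, re.IGNORECASE):
                    hits.append((path, pat, m.start()))
    return hits

# for path, pattern, offset in grep_mirror("mirror/"):
#     print(path, pattern, offset)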

Forms

When manually inspecting an application, note every page with an input field. You can find most of the forms by a click-through of the site. However, visual confirmation is not enough. Once again, you need to go to the source.

You can download the source code and search for "input type":

<input type="text" name="name" size="10" maxlength="15">
<input type="password" name="passwd" size="10" maxlength="15">
<input type=hidden name=vote value="websites">
<input type="submit" name="Submit" value="Login">

This form renders three visible items: a login field, a password field, and a submit button labeled “Login.” The HTML source reveals a fourth, hidden field named “vote.” An application may use hidden fields for several purposes, most of which seriously undermine the site's security: session handling, user identification, passwords, item costs, and other sensitive information tend to be put in hidden fields.

Newer web browsers support an autocomplete function that saves users from entering the same information every time they visit a web site. However, autocomplete is usually set to “off” for password fields:

<input type=text name="val2"
size="12" autocomplete=off>

This suggests that "val2" is a password field; at the very least, it appears to contain sensitive information that the programmers explicitly did not want the browser to store. In this instance, the fact that type="password" is not being used is itself a security issue, because the value will not be masked when a user types it into the field. So when inspecting a page's forms, make notes about all of their aspects.
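
To make sure no field goes unnoticed, you can extract every input element from the mirrored source and flag the interesting ones. This sketch reuses the standard-library parser approach shown earlier and is only a rough audit.

from html.parser import HTMLParser

class FormAuditor(HTMLParser):
    """List every <input> and flag hidden fields and autocomplete=off."""
    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        a = dict(attrs)
        flags = []
        if (a.get("type") or "text").lower() == "hidden":
            flags.append("HIDDEN FIELD")
        if str(a.get("autocomplete", "")).lower() == "off":
            flags.append("autocomplete=off (likely sensitive)")
        self.findings.append((a.get("name"), a.get("type", "text"), flags))

auditor = FormAuditor()
# auditor.feed(open("mirror/login.html").read())
# for name, field_type, flags in auditor.findings:
#     print(name, field_type, flags)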

Query Strings and Parameters

Perhaps the most important part of a given URL is the query string, the part following the question mark (in most cases) that indicates some sort of arguments or parameters being fed to a dynamic executable or library within the application. An example is shown here:

http://www.site.com/search.cgi?searchTerm=test

This shows the parameter searchTerm with the value test being fed to the search.cgi executable on this site. Query strings and their parameters are perhaps the most important piece of information to collect because they represent the core functionality of a dynamic web application, usually the part that is the least secure because it has the most moving parts.

An attacker can manipulate query string parameter values to attempt to impersonate other users, obtain restricted data, run arbitrary system commands, or execute other actions not intended by the application developers.

Parameter names may also provide information about the internal workings of the application. They may represent database column names, be obvious session IDs, or contain the username. The application manages these strings, although it may not validate them properly.
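
Python's standard urllib.parse module splits query strings cleanly, which makes it easy to build an inventory of every parameter name the application accepts:

from collections import defaultdict
from urllib.parse import urlparse, parse_qs

def collect_parameters(urls):
    """Map each dynamic page to the set of query parameter names seen for it."""
    params = defaultdict(set)
    for u in urls:
        parsed = urlparse(u)
        if parsed.query:
            params[parsed.path].update(parse_qs(parsed.query, keep_blank_values=True))
    return params

# collect_parameters(["http://www.site.com/search.cgi?searchTerm=test"])
# -> {'/search.cgi': {'searchTerm'}}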


Common cookies

The URL is not the only place to look when identifying what type of application is running. Application and web servers commonly set their own specific cookies.
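
Default session-cookie names are a reliable fingerprint. The mapping below covers a few well-known platforms; treat it as a starting point rather than an exhaustive list.

import requests

# Well-known default session cookie names and the platforms that set them.
COOKIE_HINTS = {
    "PHPSESSID": "PHP",
    "JSESSIONID": "Java servlet container (Tomcat, WebLogic, etc.)",
    "ASP.NET_SessionId": "ASP.NET",
    "ASPSESSIONID": "Classic ASP (name carries a random suffix)",
    "CFID": "ColdFusion",
    "CFTOKEN": "ColdFusion",
}

def fingerprint_cookies(url):
    """Fetch a page and report any cookies whose names match known platforms."""
    r = requests.get(url, timeout=5)
    hints = []
    for cookie in r.cookies:
        for known, platform in COOKIE_HINTS.items():
            if cookie.name.upper().startswith(known.upper()):
                hints.append((cookie.name, platform))
    return hints

# print(fingerprint_cookies("http://www.site.com/"))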


Back-end Access Points

The final set of information to collect is evidence of back-end connectivity. Note when information is read from or written to the database, for example when the application updates address information or changes passwords. Highlight pages, or comments within pages, that directly relate to a database or other back-end systems.

Certain WebDAV options enable remote administration of a web server. A misconfigured server could allow anyone to upload, delete, modify, or browse the web document root. Check to see whether these options are enabled.
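
A quick OPTIONS request shows whether WebDAV methods are exposed; the Allow and DAV response headers are standard, while the URL here is just an example.

import requests

def check_webdav(url):
    """Send an OPTIONS request and report WebDAV-related methods and headers."""
    r = requests.options(url, timeout=5)
    allow = r.headers.get("Allow", "")
    dav = r.headers.get("DAV", "")
    risky = [m for m in ("PUT", "DELETE", "PROPFIND", "MKCOL", "COPY", "MOVE")
             if m in allow.upper()]
    return {"status": r.status_code, "Allow": allow, "DAV": dav,
            "risky_methods": risky}

# print(check_webdav("http://www.site.com/"))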

Search Tools for Profiling

Search engines have always been a hacker’s best friend. It’s a good bet that at least one of the major Internet search engines has indexed your target web application at least once in the past.

• Use an open-source intelligence tool such as Maltego.
• Search for a specific web site: site:www.victim.com
• Search for related pages: related:www.victim.com
• Examine the “cached” results.
• Investigate the “similar pages” results.
• Examine search results that point to newsgroup postings about the site.
• Search using just the domain name: site:victim.com. This can return hosts such as “mail.victim.com” or “beta.victim.com”.
• To locate specific file types, use filetype:swf (Flash SWF files).
• Pay close attention to how the application interacts with its URLs.
• Look for robots.txt (see the sketch below).
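
The robots.txt check in particular is easy to automate; Disallow entries frequently point straight at the directories an administrator would rather you not find.

import requests

def disallowed_paths(base_url):
    """Fetch robots.txt and return the paths the site asks crawlers to skip."""
    r = requests.get(f"{base_url.rstrip('/')}/robots.txt", timeout=5)
    if r.status_code != 200:
        return []
    return [line.split(":", 1)[1].strip()
            for line in r.text.splitlines()
            if line.lower().startswith("disallow:")]

# print(disallowed_paths("http://www.site.com"))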

Automated Web Crawling

One of the most fundamental and powerful techniques used in profiling is mirroring the entire application to a local copy that can be scrutinized slowly and carefully. We call this process web crawling, and web crawling tools are an absolute necessity for large-scale web security assessments. Your web crawling results create the knowledge baseline for your attacks, and this baseline is the most important aspect of any web application assessment. The information you glean will help you identify the overall architecture of your target, including all of the important details of how the web application is structured: input points, directory structures, and so on. As powerful as web crawling is, it is not without its drawbacks. Here are some things that it doesn't do very well:

• Forms
Crawlers, being automated, often don't deal well with filling in web forms designed for human interaction. For example, a web site may have a multistep account registration process that requires form fill-in. If the crawler fails to complete the first form correctly, it may never reach the subsequent steps of the registration process and will thus miss the privileged pages that the application takes you to once you successfully complete registration.

• Complex flows
Some sites require that a human manually click through them.

• Client-side code
Many web crawlers have difficulty dealing with client-side code. If your target web site uses a lot of JavaScript, there's a good chance you'll have to work through the code manually to get a proper baseline of how the application works. This problem with client-side code is most common in free and cheap web crawlers. Examples of client-side code include JavaScript, Flash, ActiveX, Java applets, and AJAX (Asynchronous JavaScript and XML).

• State problems
Attempting to crawl an area of a web site that requires web-based authentication is problematic, so you should profile the authenticated portions of the site manually, or turn to a web security assessment product when your target site requires that you maintain state. No freeware crawler will do an adequate job here.

• Broken HTML/HTTP
A lot of crawlers attempt to follow the HTTP and HTML specifications when reviewing an application, but a major issue is that almost no web application strictly follows the HTML specification. In fact, a broken link on a web site may work in one browser but not another. This is a consistent problem for an automated product, which must identify that a piece of code is actually broken and automatically remedy the problem so the code works the way a browser such as Internet Explorer would interpret it.

• Web services
As more applications are designed as loosely coupled series of services, it becomes more difficult for traditional web crawlers to determine relationships and trust boundaries among domains. Many modern web applications rely on a web-based API to provide data to their clients, and traditional crawlers will not be able to execute and map an API properly without explicit instructions on how that execution should be performed.

Despite these drawbacks, we wholeheartedly recommend web crawling as an essential part of the profiling process.

Web Crawling Tools

• Lynx
• wget
• Burp Suite Spider
• Teleport Pro
• BlackWidow
• Offline Explorer Pro
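
If you want to script a quick baseline of your own before reaching for one of these tools, a minimal same-host crawler looks roughly like the sketch below. It inherits every limitation listed above: no form handling, no JavaScript, no authentication.

import requests
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

class LinkExtractor(HTMLParser):
    """Collect the href of every anchor tag on a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and a.get("href"):
            self.links.append(a["href"])

def crawl(start_url, max_pages=200):
    """Breadth-first crawl restricted to the starting host; returns visited URLs."""
    host = urlparse(start_url).netloc
    seen, queue = set(), deque([start_url])
    while queue and len(seen) < max_pages:
        url = queue.popleft()
        if url in seen:
            continue
        seen.add(url)
        try:
            r = requests.get(url, timeout=5)
        except requests.RequestException:
            continue
        if "text/html" not in r.headers.get("Content-Type", ""):
            continue
        extractor = LinkExtractor()
        extractor.feed(r.text)
        for href in extractor.links:
            nxt = urljoin(url, href).split("#")[0]
            if urlparse(nxt).netloc == host and nxt not in seen:
                queue.append(nxt)
    return seen

# pages = crawl("http://www.site.com/")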

Countermeasures

• Protect against directory listing.
• Limit the content of the Location header so it does not reveal the web server's internal IP address.
• Modify the IIS metabase with adsutil.vbs:
  D:\Inetpub\adminscripts> adsutil.vbs set w3svc/UseHostName True
  D:\Inetpub\adminscripts> net start w3svc
• Apache: disable directory enumeration (for example, with Options -Indexes).
• IIS: place the InetPub directory on a volume different from the system root.
• UNIX web servers: place directories in a chroot environment (a mitigation for traversal attacks).
• Make sure none of these common files contains sensitive information.
• Consolidate all JavaScript files into a single directory. Ensure they do not have “execute” permissions (they should only be read by the server, not executed as scripts).
• For IIS, place .inc, .js, .xsl, and other include files outside of the web root by wrapping them in a COM object.
• Strip developer comments.
• Use path names relative to the web root or the current directory; do not use full path names that include drive letters.
• If the site requires authentication, ensure authentication is applied to the entire directory and its subdirectories.
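
Two of these countermeasures, directory listing and the Location header, are easy to verify from the outside. The sketch below is a rough external check, not a substitute for reviewing the server configuration; the URLs are examples.

import re
import requests

PRIVATE_IP = re.compile(
    r"\b(?:10\.\d{1,3}|192\.168|172\.(?:1[6-9]|2\d|3[01]))\.\d{1,3}\.\d{1,3}\b")

def directory_listing_enabled(url):
    """A page titled 'Index of /...' usually means directory listing is enabled."""
    r = requests.get(url, timeout=5)
    return "Index of /" in r.text

def location_leaks_internal_ip(url):
    """Look for a private IP address leaking through a redirect Location header."""
    r = requests.get(url, timeout=5, allow_redirects=False)
    return PRIVATE_IP.search(r.headers.get("Location", "")) is not None

# print(directory_listing_enabled("http://www.site.com/images/"))
# print(location_leaks_internal_ip("http://www.site.com/admin"))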
