Advertising Digital Media

Archive for the ‘Spam’ Category

Complement set email filtering

Complement Set Filtering (CSF) is a method for filtering unsolicited bulk email (UBE or spam). The technique uses at least two email accounts: a primary account, which receives both spam and non-spam, and one or more secondary accounts that receive only spam. CSF treats each account as a set of messages, identifies the messages contained in both sets, and removes them from the primary set; what remains is the set-theoretic difference (the relative complement) of the primary and secondary sets.

Implementation

CSF is implemented by comparing message content in a UBE account (separate mailbox or alias) with the message content in a primary account. By definition, messages contained in the UBE account are spam so messages in the primary account that are substantially similar to messages in the UBE account are also spam. When the same message is found in both the primary account and the UBE account, it is deleted from the primary account.
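The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names, the use of `difflib.SequenceMatcher` as the similarity measure, and the 0.8 threshold for "substantially similar" are all assumptions made for the example.

```python
import re
from difflib import SequenceMatcher


def normalize(body: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return re.sub(r"\s+", " ", body.lower()).strip()


def is_similar(a: str, b: str, threshold: float = 0.8) -> bool:
    """Treat two bodies as 'substantially similar' above an assumed ratio."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def complement_set_filter(primary: list[str], ube: list[str]) -> list[str]:
    """Keep only the primary messages that match nothing in the UBE account."""
    return [
        message for message in primary
        if not any(is_similar(message, spam) for spam in ube)
    ]


primary_inbox = [
    "Hi John, the quarterly report is attached. Regards, Ann",
    "Cheap watches!!! Buy now, John, at discount prices today",
]
ube_inbox = ["Cheap watches!!! Buy now, friend, at discount prices today"]

kept = complement_set_filter(primary_inbox, ube_inbox)
print(kept)
```

Only the message from Ann survives, because the second primary message closely matches one that arrived in the spam-only account. A production filter would need a cheaper comparison than pairwise matching for large mailboxes.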

The UBE account is established by creating a mailbox (or alias) incorporating a common first name (to help spammers guess the address) and the domain of the primary account, then exposing the UBE account to the internet. For example, if the primary mailbox is johnm@domain.com, the UBE account might be john@domain.com. After the UBE mailbox is set up, the email address is given to spammers by posting it to message boards, portal groups, WHOIS listings, ecommerce sites and Usenet.

CSF works especially well in corporate environments where the domain is targeted by spammers and UBE tends to be very similar from mailbox to mailbox. Also, because CSF does not depend on characteristics of past UBE to identify current UBE, it is particularly well suited to identifying UBE with new subject matter.

Advantages of CSF

Many spam-filtering techniques search for patterns and known spam subject matter in the headers and bodies of messages. Others use probabilities (Bayesian statistical methods, for example) to identify unwanted messages. CSF is effective as a standalone filter or can be combined with other techniques.

CSF has at least three advantages over Bayesian and pattern analysis algorithms. First, CSF does not depend on content analysis other than what is required to find similarities between messages in the primary and UBE accounts. Second, CSF does not use scoring (word ranking), which can be circumvented by message obfuscation (V!agra instead of Viagra, for example). Third, CSF takes advantage of the fact that most UBE contains identical message content, particularly messages targeted at specific corporate domains.

This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia.


Bogofilter

Bogofilter is a mail filter that classifies e-mail as spam or ham (non-spam) by a statistical analysis of the message’s header and content (body). The program is able to learn from the user’s classifications and corrections. It was originally written by Eric S. Raymond, and is now maintained together with a group of contributors including but not limited to Adrian Otto, Matthias Andree, Matt Martini and David Relson.

The statistical technique used is known as Bayesian filtering and its use for spam was first described by Paul Graham in his article A Plan For Spam. Gary Robinson, in his weblog Rants, suggests some refinements for improved discrimination between spam and ham. Bogofilter’s primary algorithm uses the f(w) parameter and the Fisher inverse chi-square technique that he describes.
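To make the two named ingredients concrete, here is a hedged sketch of Robinson's f(w) smoothing and the Fisher inverse chi-square combination. It follows the formulas as commonly published, not Bogofilter's actual source code; the default values for `s` and `x` and all function names are assumptions for illustration.

```python
import math


def robinson_fw(p_w: float, n: int, s: float = 1.0, x: float = 0.5) -> float:
    """Robinson's f(w): shrink the raw spam estimate p_w toward a prior x.

    n is the number of training messages containing the word and s is the
    weight given to the prior. The defaults here are assumptions.
    """
    return (s * x + n * p_w) / (s + n)


def chi2q(x: float, dof: int) -> float:
    """Chi-square survival function, valid for an even number of degrees of freedom."""
    m = x / 2.0
    term = total = math.exp(-m)
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)


def fisher_score(probs: list[float]) -> float:
    """Combine per-word spam probabilities (each strictly between 0 and 1)."""
    n = len(probs)
    spam_evidence = 1.0 - chi2q(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    ham_evidence = 1.0 - chi2q(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    # Near 1 means spam, near 0 means ham, near 0.5 means unsure.
    return (1.0 + spam_evidence - ham_evidence) / 2.0


print(robinson_fw(1.0, 0))   # unseen word falls back to the prior
print(fisher_score([0.99, 0.99, 0.99, 0.99, 0.99]))
print(fisher_score([0.9, 0.1]))
```

The smoothing step matters because a word seen only once in spam would otherwise get a raw probability of 1.0 and dominate the result; with f(w) it stays close to the prior until more evidence accumulates.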

Bogofilter is run by an MDA script to classify an incoming message as spam or ham (using wordlists stored in Berkeley DB). It processes plain text and HTML, supports multipart MIME messages with decoding of base64, quoted-printable, and uuencoded text, and ignores attachments such as images.

Bogofilter is written in C, and runs on Linux, FreeBSD, Solaris, Mac OS X, HP-UX, AIX and other platforms.


Markovian discrimination

Markovian discrimination in spam filtering is a method used in CRM114 and other spam filters to model the statistical behavior of spam and non-spam more accurately than simple Bayesian methods do. A simple Bayesian model of written text contains only the dictionary of legal words and their relative probabilities. A Markovian model adds the relative transition probabilities that, given one word, predict what the next word will be. It is based on the theory of Markov chains developed by Andrey Markov, hence the name. In essence, a Bayesian filter works on single words alone, while a Markovian filter works on phrases or entire sentences.
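The idea of transition probabilities can be illustrated with a first-order word model, in which each word predicts the next. The sketch below is a deliberately simple visible Markov (bigram) classifier with add-one smoothing; it is not CRM114's algorithm, and the class name, training sentences, and smoothing choice are assumptions for the example.

```python
import math
from collections import Counter


class BigramModel:
    """First-order Markov model over word tokens."""

    def __init__(self) -> None:
        self.pair_counts: Counter = Counter()
        self.prev_counts: Counter = Counter()
        self.vocab: set = set()

    def train(self, text: str) -> None:
        tokens = ["<s>"] + text.lower().split()
        self.vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            self.pair_counts[(prev, cur)] += 1
            self.prev_counts[prev] += 1

    def log_likelihood(self, text: str, vocab_size: int) -> float:
        tokens = ["<s>"] + text.lower().split()
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            # Add-one smoothing keeps unseen transitions at a nonzero probability.
            numerator = self.pair_counts[(prev, cur)] + 1
            denominator = self.prev_counts[prev] + vocab_size
            total += math.log(numerator / denominator)
        return total


def classify(text: str, spam: BigramModel, ham: BigramModel) -> str:
    vocab_size = len(spam.vocab | ham.vocab)
    spam_ll = spam.log_likelihood(text, vocab_size)
    ham_ll = ham.log_likelihood(text, vocab_size)
    return "spam" if spam_ll > ham_ll else "ham"


spam_model, ham_model = BigramModel(), BigramModel()
for line in ["buy cheap pills now", "cheap pills buy now"]:
    spam_model.train(line)
for line in ["the meeting is now at noon", "see you at the meeting"]:
    ham_model.train(line)

print(classify("buy cheap pills", spam_model, ham_model))
print(classify("see you at noon", spam_model, ham_model))
```

Unlike a single-word model, this one scores the phrase "cheap pills" as a unit of evidence because the transition from "cheap" to "pills" was seen in the spam training text.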

There are two types of Markov models: the visible Markov model and the hidden Markov model (HMM). The difference is that with a visible Markov model, the current word is considered to contain the entire state of the language model, while a hidden Markov model hides the state and presumes only that the current word is probabilistically related to the actual internal state of the language.

For example, in a visible Markov model the word “the” should predict with accuracy the following word, while in a hidden Markov model, the entire prior text implies the actual state and predicts the following words, but does not actually guarantee that state or prediction. Since the latter case is what is encountered in spam filtering, hidden Markov models are almost always used. In particular, because of storage limitations, the specific type of hidden Markov model called a Markov random field is particularly applicable, usually with a clique size of between four and six tokens.


Bayesian spam filtering

Bayesian spam filtering is the process of using Bayesian statistical methods to classify documents into categories.

Bayesian filtering was proposed by Sahami et al. (1998) and gained attention in 2002 when it was described in the paper A Plan for Spam by Paul Graham. Since then it has become a popular mechanism to distinguish illegitimate spam email from legitimate email. Many modern mail programs such as Mozilla Thunderbird implement Bayesian spam filtering. Server-side email filters, such as SpamAssassin and ASSP, make use of Bayesian spam filtering techniques, and the functionality is sometimes embedded within mail server software itself.

Advantages

The advantage of Bayesian spam filtering is that it can be trained on a per-user basis.

The spam that a user receives is often related to the user’s online activities. For example, a user may have been subscribed to an online newsletter that the user considers to be spam. This online newsletter is likely to contain words that are common to all of its issues, such as the name of the newsletter and its originating email address. A Bayesian spam filter will eventually assign a higher spam probability to those words, based on the user’s specific patterns.

The legitimate e-mails a user receives tend to be different. For example, in a corporate environment, the company name and the names of clients or customers will be mentioned often. The filter will assign a lower spam probability to emails containing those names.

The word probabilities are unique to each user and can evolve over time with corrective training whenever the filter incorrectly classifies an email. As a result, Bayesian spam filtering accuracy after training is often superior to pre-defined rules.

It can perform particularly well in avoiding false positives, where legitimate email is incorrectly classified as spam. For example, if the email contains the word “Nigeria”, which frequently appeared in a long spam campaign, a pre-defined rules filter might reject it outright. A Bayesian filter would mark the word “Nigeria” as a probable spam word, but would take into account other important words that usually indicate legitimate e-mail. For example, the name of a spouse may strongly indicate the e-mail is not spam, which could overcome the use of the word “Nigeria.”
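The “Nigeria” example can be reproduced with a small per-user naive Bayes filter. The sketch below is illustrative only: the class name, the training messages, the name “alice” standing in for the spouse, and the add-one smoothing are assumptions, and real filters use more careful tokenization and word selection.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Tiny per-user filter that learns word frequencies from labeled mail."""

    def __init__(self) -> None:
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        # Count each word once per message.
        self.word_counts[label].update(set(text.lower().split()))
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        n_spam = self.message_counts["spam"]
        n_ham = self.message_counts["ham"]
        log_odds = math.log((n_spam + 1) / (n_ham + 1))
        for word in set(text.lower().split()):
            # Add-one smoothing so unseen words do not zero out the estimate.
            p_spam = (self.word_counts["spam"][word] + 1) / (n_spam + 2)
            p_ham = (self.word_counts["ham"][word] + 1) / (n_ham + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))


user_filter = NaiveBayesFilter()
user_filter.train("urgent transfer from nigeria prince", "spam")
user_filter.train("nigeria lottery winner claim funds", "spam")
user_filter.train("alice says dinner at eight", "ham")
user_filter.train("dinner with alice on friday", "ham")
user_filter.train("report from alice attached", "ham")

print(user_filter.spam_probability("greetings from nigeria"))
print(user_filter.spam_probability("alice wants dinner in nigeria"))
```

The first message scores above 0.5 because “nigeria” appears only in this user's spam, while the second scores below 0.5 because the words “alice” and “dinner”, learned from this user's legitimate mail, outweigh it.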

Some spam filters combine the results of both Bayesian spam filtering and pre-defined rules resulting in even higher filtering accuracy. Recent spammer tactics include insertion of random innocuous words that are not normally associated with spam, thereby decreasing the email’s spam score, making it more likely to slip past a Bayesian spam filter.

References

  • (Sahami et al., 1998): M. Sahami, S. Dumais, D. Heckerman, E. Horvitz: A Bayesian approach to filtering junk e-mail, AAAI’98 Workshop on Learning for Text Categorization, 1998.


Content filtering

Content filtering is the most commonly used group of methods to filter spam. Content filters act either on the content, the information contained in the mail body, or on the mail headers (like “Subject:”) to either classify, accept or reject a mail.

The most popular filter is the Bayesian filter, which is a statistical filter.

Anti-virus methods can usually be classified as content filters too, since they scan (put simply) either the binary attachments of a mail or its HTML contents.

Common content filters are:

Bayesian
Attachment
Mail header
Mailing List
HTML anomalies
Language
Heuristic
Regular Expression
Phrases
Proximity
URL
Content-encoding
Char-set


Anti-spam appliances

Deployed at the gateway or in front of the mail server, anti-spam appliances are hardware-based solutions integrated with on-board anti-spam software and are normally driven by an operating system optimized for spam filtering. They are generally used in larger networks such as companies and corporations, ISPs, universities, etc.

Anti-spam appliances are often selected instead of software-only solutions for the following reasons:

  • the customer prefers to buy hardware instead of software
  • ease of installation
  • operating system requirements (e.g. company policy requires Linux, but the software is not available for this OS)
  • independence from existing hardware


Spam reduction tools

  • Mozilla and the stand-alone Thunderbird: e-mail programs (“clients”) with a Bayesian filter, i.e. a filter that keeps learning and is therefore able to adapt to the constantly changing forms of spam
  • Disposable e-mail accounts, various types for registering on web sites etc.
    • E4ward.com You can use your own domain name or e4ward.com for your aliases
    • Sneakemail original disposable email address service
    • Spamgourmet expire after a number of emails, but can be reset or ignored for some senders
    • Jetable expiring in 1-8 days
    • Mailinator instant email accounts, self-destructing email after you read it.
    • shortMail.net expiring email forwarding accounts, and instant anonymous online email
    • SpamDay allows you to create forward addresses and webmail addresses, valid for 24 hours. Support for RSS feed!
    • SpamMotel Use it whenever you are required to give out your e-mail address on the internet.
    • ipoo.org Signups without spam. Fast, no ads. Includes RSS to check your SPAM inbox.
  • Tools to filter out spam
    • Bogofilter Statistical filter (not strictly Bayesian)
    • Firetrust MailWasher Pro. Removes spam while it is still on your POP3 server.
    • Hexamail Guard – Anti-spam gateway software
    • iMailLight smart plugin for Outlook, based on Bayesian filtering
    • SpamAssassin heuristic filter
    • CRM114 Uses a hidden Markov model to classify spam
    • SpamBayes Bayesian filter using ideas that improve on Paul Graham’s.
    • Spamihilator Free antispam program with a well-working Bayesian filter and many other filter plugins. It works with almost all email programs.
    • SpamPal Free Windows filter with lots of filtering methods. Client or server-side filtering.
    • TMDA, a challenge/response system
    • trimMail Inbox – Anti-spam firewall
    • Checksum-based filters:
      • Distributed Checksum Clearinghouse
      • Vipul’s Razor
  • Tools to filter out viruses
    • Clam antivirus
  • Contact forms that hide email addresses
    • Contact Form – Open source (GPL) – Requires a webserver, Perl, and Sendmail
    • form2mail – Open source (GPL) – Requires a webserver, PHP, MySQL, and Sendmail
    • MailWebForm – Open source (GPL) – Requires Java, Java Servlets, and Java Mail
    • SCForm – Open source (GPL) – Requires a webserver, PHP and Sendmail
  • Other tools
    • Sam Spade program with tools
    • SpamCop a place to report spam
  • Services which guarantee messages as not being spam:
    • Habeas Sender Warranted Email
    • Bonded Sender
  • Making it harder to harvest e-mail addresses
    • Project Honey Pot
    • address-protector.com A service to protect email addresses with image and audio captchas
    • SpamFreeze allows users to post a URL online instead of their email address


Examination of anti-spam methods

There are a number of services and software systems that mail sites and users can use to reduce the load of spam on their systems and mailboxes. Some of these depend upon rejecting email from Internet sites known or likely to send spam. Others rely on automatically analyzing the content of email messages and weeding out those which resemble spam. These two approaches are sometimes termed blocking and filtering.

Blocking and filtering each have their advocates and advantages. While both reduce the amount of spam delivered to users’ mailboxes, blocking does much more to alleviate the bandwidth cost of spam, since spam can be rejected before the message is transmitted to the recipient’s mail server. Filtering tends to be more thorough, since it can examine all the details of a message. Many modern spam filtering systems take advantage of machine learning techniques, which vastly improve their accuracy over manual methods. However, some people find filtering intrusive to privacy, and many mail administrators prefer blocking to deny access to their systems from sites tolerant of spammers.

DNSBLs

DNS-based Blackhole Lists, or DNSBLs, are used for heuristic filtering and blocking. A site publishes lists (typically of IP addresses) via the DNS, in such a way that mail servers can easily be set to reject mail from those sources. There are literally scores of DNSBLs, each of which reflects different policies: some list sites known to emit spam; others list open mail relays or proxies; others list ISPs known to support spam. Other DNS-based anti-spam systems list known good (“white”) or bad (“black”) IPs, domains, or URLs, including RHSBLs and URIBLs. For history, details, and examples of DNSBLs, see DNSBL.
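A DNSBL lookup is an ordinary DNS query: by convention, the octets of the IPv4 address are reversed and prepended to the list's zone name, and an answer means the address is listed. The sketch below shows how a mail server might build and issue that query. The zone `dnsbl.example.org` is a placeholder, not a real list, and the function names are assumptions.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the DNSBL zone has an address record for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record (or a lookup failure) is treated as "not listed" here.
        return False


print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

Treating every lookup failure as "not listed" is a simplification; a real server would distinguish a negative answer from a timeout so that a DNS outage does not silently disable blocking.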

Content-based filtering

Until recently, content filtering techniques relied on mail administrators specifying lists of words or regular expressions disallowed in mail messages. Thus, if a site receives spam advertising “herbal Viagra”, the administrator might place these words in the filter configuration. The mail server would then reject any message containing the phrase.

Content-based filtering can also filter based on content other than the words and phrases that make up the body of the message. Primarily, this means looking at the header of the email, the part of the message that contains information about the message, and not the body text of the message. Spammers will often spoof fields in the header in order to hide their identities, or to try to make the email look more legitimate than it is; many of these spoofing methods can be detected. Also, spam-sending software often produces a header that violates the RFC 2822 standard on how the email header is supposed to be formed.

Disadvantages of this static filtering are threefold: First, it is time-consuming to maintain. Second, it is prone to false positives. Third, these false positives are not equally distributed: manual content filtering is prone to reject legitimate messages on topics related to products advertised in spam. A system administrator who attempts to reject spam messages which advertise mortgage refinancing may easily inadvertently block legitimate mail on the same subject.

Finally, spammers can change the phrases and spellings they use, or employ methods to try to trip up phrase detectors. This means more work for the administrator. However, it also has some advantages for the spam fighter. If the spammer starts spelling “Viagra” as “V1agra” or “Via_gra”, it makes it harder for the spammer’s intended audience to read their messages. If they try to trip up the phrase detector, by, for example, inserting an invisible-to-the-user HTML comment in the middle of a word (“Via<!-- -->gra”), this sleight of hand is itself easily detectable, and is a good indication that the message is spam. And if they send spam that consists entirely of images, so that anti-spam software can’t analyze the words and phrases in the message, the fact that there is no readable text in the body can be detected.
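Detecting an HTML comment hidden inside a word takes little more than a regular expression. The following sketch flags such comments and also strips them so that a phrase filter sees the word the reader would see; the pattern and function names are assumptions chosen for the example.

```python
import re

# An HTML comment with a word character immediately before and after it.
COMMENT_INSIDE_WORD = re.compile(r"\w<!--.*?-->\w", re.DOTALL)
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)


def has_split_word(html: str) -> bool:
    """True if a comment is embedded in the middle of a word."""
    return COMMENT_INSIDE_WORD.search(html) is not None


def strip_comments(html: str) -> str:
    """Remove comments so phrase matching sees the rendered text."""
    return HTML_COMMENT.sub("", html)


sample = "Buy Via<!-- -->gra now"
print(has_split_word(sample))
print(strip_comments(sample))
```

A comment that sits between words or tags, as in ordinary hand-written HTML, does not trigger the check, which keeps the signal specific to the obfuscation trick.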

However, content filtering can also be implemented by examining the URLs present (i.e. spamvertised) in an email message. This form of content filtering is much harder to disguise as the URLs must resolve to a valid domain name. Extracting a list of such links and comparing them to published sources of spamvertised domains is a simple and reliable way to eliminate a large percentage of spam via content analysis.
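Extracting links and checking their domains against a published list might look like the sketch below. The URL pattern is intentionally simple, the domains are placeholders under reserved example names, and the function names are assumptions; a real filter would typically query a URI blocklist rather than a local set.

```python
import re
from urllib.parse import urlparse

URL_PATTERN = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def extract_domains(body: str) -> set[str]:
    """Collect the host names of all http(s) links found in a message body."""
    domains = set()
    for url in URL_PATTERN.findall(body):
        host = urlparse(url).hostname
        if host:
            domains.add(host.lower())
    return domains


def mentions_listed_domain(body: str, listed: set[str]) -> bool:
    """True if any linked host is a listed domain or a subdomain of one."""
    return any(
        host == domain or host.endswith("." + domain)
        for host in extract_domains(body)
        for domain in listed
    )


message = 'Visit http://shop.pills.example/buy?id=1 or <a href="https://Example.ORG/x">here</a>'
print(extract_domains(message))
print(mentions_listed_domain(message, {"pills.example"}))
```

Matching subdomains as well as exact hosts matters because a spammer can generate unlimited host names under a single registered domain.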

Statistical filtering

Statistical filtering was first proposed in 1998 by Mehran Sahami et al., at the AAAI-98 Workshop on Learning for Text Categorization. A statistical filter is a kind of document classification system, and a number of machine learning researchers have turned their attention to the problem. Statistical filtering was popularized by Paul Graham’s influential 2002 article A Plan for Spam, which proposed the use of naive Bayes classifiers to predict whether messages are spam or not – based on collections of spam and nonspam (“ham”) email submitted by users. [1]

Statistical filtering, once set up, requires no maintenance per se: instead, users mark messages as spam or nonspam and the filtering software learns from these judgements. Thus, a statistical filter does not reflect the software author’s or administrator’s biases as to content, but it does reflect the user’s biases as to content; a biochemist who is researching Viagra won’t have messages containing the word “Viagra” flagged as spam, because “Viagra” will show up often in his or her legitimate messages. A statistical filter can also respond quickly to changes in spam content, without administrative intervention.

Spammers have attempted to fight statistical filtering by inserting many random but valid “noise” words or sentences into their messages while attempting to hide them from view, making it more likely that the filter will classify the message as neutral. Attempts to hide the noise words include setting them in tiny font or the same colour as the background. However, these noise countermeasures seem to have been largely ineffective.

Software programs that implement statistical filtering include Bogofilter, the e-mail programs Mozilla and Mozilla Thunderbird, and later revisions of SpamAssassin. Another interesting project is CRM114, which hashes phrases and performs Bayesian classification on the phrases.

There is also the free mail filter POPFile [2], which uses Bayesian filtering to sort mail into as many categories as you want (family, friends, co-workers, spam, and so on).

Checksum-based filtering

Checksum-based filtering takes advantage of the fact that, for any individual spammer, the messages he or she sends out will often be mostly identical; the only differences are web bugs and any places where the text contains the recipient’s name or email address. Checksum-based filters strip out everything that might vary between messages, reduce what remains to a checksum, and look that checksum up in a database that collects the checksums of messages that email recipients consider to be spam (some people have a button on their email client which they can click to nominate a message as spam). If the checksum is in the database, the message is likely to be spam.
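The strip, reduce, and look-up steps can be sketched as follows. This is not the algorithm used by any particular checksum service; the normalization rules, the choice of SHA-256, and the in-memory set standing in for the shared database are all assumptions for illustration.

```python
import hashlib
import re


def fuzzy_checksum(body: str, recipient: str) -> str:
    """Hash a message after removing the parts likely to vary per recipient."""
    text = body.lower()
    local_part = recipient.split("@")[0].lower()
    text = text.replace(recipient.lower(), "")                      # recipient address
    text = re.sub(r"\b" + re.escape(local_part) + r"\b", "", text)  # recipient name
    text = re.sub(r"https?://\S+", "", text)                        # tracking links, web bugs
    text = re.sub(r"[^a-z]+", "", text)                             # digits, spacing, punctuation
    return hashlib.sha256(text.encode("ascii")).hexdigest()


reported_spam: set[str] = set()   # stands in for the shared checksum database

first = "Dear adam, you have won! Claim at http://x.example/t?id=123 (sent to adam@example.com)"
second = "Dear betty, you have won! Claim at http://x.example/t?id=987 (sent to betty@example.com)"

# One recipient reports the message; another recipient's copy then matches.
reported_spam.add(fuzzy_checksum(first, "adam@example.com"))
print(fuzzy_checksum(second, "betty@example.com") in reported_spam)
```

Because an exact hash changes completely when a single character differs, everything depends on how well the normalization removes per-recipient variation before hashing.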

The advantage of this type of filtering is that it lets ordinary users help identify spam, and not just administrators, thus vastly increasing the pool of spam fighters. The disadvantage is that spammers can insert unique invisible gibberish — known as hashbusters — into the middle of each of their messages, thus making each message unique and having a different checksum. This leads to an arms race between the developers of the checksum software and the developers of the spam-generating software.

Checksum-based filtering methods include:

  • Distributed Checksum Clearinghouse
  • Vipul’s Razor

Authentication and Reputation (A&R)

A number of systems have been proposed to allow acceptance of email from servers which have authenticated in some fashion as senders of only legitimate email. Many of these systems use the DNS, as do DNSBLs; but rather than being used to list nonconformant sites, the DNS is used to list sites authorized to send email, and (sometimes) to determine the reputation of those sites. Other methods of identifying ham and spam are still used. The A&R allows much ham to be more reliably identified, which allows spam detectors to be made more sensitive without causing more false positive results. The increased sensitivity allows more spam to be identified as such. Also, A&R methods tend to be less resource-intensive than other filtering methods, which can be skipped for messages identified by A&R as ham.

Sender-supported whitelists and tags

There are a small number of organizations which offer IP whitelisting and/or licensed tags that can be placed in email (for a fee) to assure recipients’ systems that the messages thus tagged are not spam. This system relies on legal enforcement of the tag. The intent is for email administrators to whitelist messages bearing the licensed tag.

A potential difficulty with such systems is that the licensing organization makes its money by licensing more senders to use the tag, not by strictly enforcing the rules upon licensees. A further concern is that senders whose messages are more likely to be considered spam would accrue a greater benefit from using such a tag. Together, these factors form a perverse incentive for licensing organizations to be lenient with licensees who have offended. However, the value of a license would drop if it were not strictly enforced, and financial gains from enforcing the license can provide an additional incentive for strict enforcement. The Habeas mail classing system attempts to address this issue further by classing email according to origin, purpose, and permission. The purpose of the classification is to show why the email is likely to be permission-based email rather than spam.

Ham passwords

Another approach for countering spam is to use a “ham password”. Systems that use ham passwords ask unrecognised senders to include in their email a password that demonstrates that the email message is a “ham” (not spam) message. Typically the email address and ham password would be described on a web page, and the ham password would be included in the “subject” line of an email message. Ham passwords are often combined with filtering systems, to counter the risk that a filtering system will accidentally identify a ham message as a spam message.

The “plus addressing” technique appends a password to the “username” part of the email address.
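Both variants, a password in the subject line and a password appended to the username with a plus sign, can be checked in a few lines. The sketch below is hypothetical: the function names, the password “tulip”, and the case-insensitive subject match are assumptions made for the example.

```python
from typing import Optional


def plus_address_password(address: str) -> Optional[str]:
    """Return the tag after '+' in the local part, or None if there is none."""
    local, _, _domain = address.partition("@")
    _user, separator, tag = local.partition("+")
    return tag if separator else None


def is_ham(to_address: str, subject: str, password: str) -> bool:
    """Accept a message that carries the ham password in either place."""
    if plus_address_password(to_address) == password:
        return True
    return password.lower() in subject.lower()


print(plus_address_password("john+tulip@example.com"))
print(is_ham("john@example.com", "Re: invoice [tulip]", "tulip"))
print(is_ham("john@example.com", "hello", "tulip"))
```

A message that fails this check would normally be passed on to the regular filter rather than rejected, since the password is meant to rescue legitimate mail, not to gate all mail.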

Cost-based systems

Since spam occurs primarily because it is so cheap to send, a proposed set of solutions requires that senders pay some cost for each message they send, making spam uneconomic.

Stamps

Some gatekeeper, such as Microsoft, would sell electronic stamps and keep the proceeds. Alternatively, a micropayment, such as electronic money, would be paid by the sender to the recipient, the recipient’s ISP, or some other gatekeeper.

Hashcash

Hashcash and similar systems require that a sender pay a computational cost by performing a calculation that the receiver can later verify. Verification must be much faster than performing the calculation, so that the computation slows down a sender but does not significantly impact a receiver. The point is to slow down the machines that send most of the spam, often millions and millions of them. While a user who wants to send email to a moderate number of recipients suffers only a few seconds’ delay, sending millions of emails would take an unaffordable amount of time.
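The asymmetry between minting and verifying can be demonstrated with a simplified proof-of-work scheme: the sender searches for a nonce whose hash has a required number of leading zero bits, and the receiver checks it with a single hash. This sketch uses SHA-256 and a bare `resource:nonce` string for brevity; the real Hashcash stamp format and hash function differ, and the function names and 12-bit difficulty are assumptions.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def verify(resource: str, nonce: int, bits: int) -> bool:
    """Cheap check: one hash, regardless of how long minting took."""
    digest = hashlib.sha256(f"{resource}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= bits


def mint(resource: str, bits: int) -> int:
    """Expensive search: about 2**bits hashes are needed on average."""
    for nonce in count():
        if verify(resource, nonce, bits):
            return nonce


stamp = mint("adam@example.com", 12)
print(verify("adam@example.com", stamp, 12))
```

Each extra bit of difficulty doubles the expected minting work while leaving verification at one hash, which is what lets the receiver tune the cost imposed on bulk senders.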

Bonds

A refinement of stamp systems is to retain the micropayment only if the recipient considers the email to be abusive. This addresses the principal objection to stamp systems: popular free legitimate mailing list hosts would be unable to continue to provide their services if they had to pay postage for every message they sent out.

Issues

A difficulty that must be dealt with by most anti-spam methods, including DNSBLs, Authentication and Reputation (A&R), sender-supported whitelists and tags, ham passwords, cost-based systems, heuristic filtering, and challenge/response systems, is that spammers already (illegally) use other people’s computers to send spam. The computers in question are already infected with viruses and spyware operated by the spam senders, in some cases seriously damaging the computer’s responsiveness to the legitimate user. Spam from the legitimate user’s computer can be sent using the user’s and/or system’s identity, list of correspondents, reputation, credentials, stamps, hashcash and/or bonds. The added motivation to steal from such systems in order to abuse these things may simply impel spammers to infect more computers and cause greater damage. On the other hand, this could compel computer users to finally secure their systems, reducing botnets, which would have myriad other benefits, as they are used for extortion, phishing, and terrorism, as well as spam. Ultimately, any system that holds senders responsible for the mail they send needs to deal with the situation of irresponsible senders that may send both spam and ham.

Heuristic filtering

Heuristic filtering, such as is implemented in the program SpamAssassin, uses some or all of the various tests for spam mentioned above, and assigns a numerical score to each test. Each message is scanned for these patterns, and the applicable scores tallied up. If the total is above a fixed value, the message is rejected or flagged as spam. By ensuring that no single spam test by itself can flag a message as spam, the false positive rate can be greatly reduced. [3]
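A score-and-threshold filter of this kind can be sketched as a list of tests with weights. The rule names, weights, and threshold below are invented for illustration and are not SpamAssassin's; the only property carried over from the description is that no single rule's score exceeds the threshold on its own.

```python
import re
from typing import Callable, NamedTuple


class Rule(NamedTuple):
    name: str
    matches: Callable[[dict], bool]
    score: float


# Hypothetical rules; each inspects a message given as {"subject": ..., "body": ...}.
RULES = [
    Rule("SUBJECT_ALL_CAPS", lambda m: m["subject"].isupper(), 1.5),
    Rule("DRUG_NAME_VARIANT", lambda m: bool(re.search(r"v[i1!]agra", m["body"], re.I)), 2.5),
    Rule("MANY_EXCLAMATIONS", lambda m: m["body"].count("!") >= 3, 1.0),
    Rule("COMMENT_INSIDE_WORD", lambda m: bool(re.search(r"\w<!--.*?-->\w", m["body"], re.S)), 2.0),
]
THRESHOLD = 4.0


def score_message(message: dict) -> tuple[float, list[str]]:
    """Return the total score and the names of the rules that fired."""
    hits = [rule for rule in RULES if rule.matches(message)]
    return sum(rule.score for rule in hits), [rule.name for rule in hits]


def is_spam(message: dict) -> bool:
    return score_message(message)[0] > THRESHOLD


spam = {"subject": "BUY NOW", "body": "Cheap V1agra!!! Act now"}
legit = {"subject": "Viagra trial results", "body": "The viagra cohort data is attached."}
print(score_message(spam), is_spam(spam))
print(score_message(legit), is_spam(legit))
```

The legitimate message still trips the drug-name rule, but because that rule alone stays under the threshold the message is delivered, which is the false-positive protection the paragraph above describes.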

Tarpits and Honeypots

A tarpit is any server software which intentionally responds pathologically slowly to client commands. A honeypot is a server which attempts to attract attacks. Some mail administrators operate tarpits to impede spammers’ attempts at sending messages, and honeypots to detect the activity of spammers. By running a tarpit which appears to be an open mail relay, or which treats acceptable mail normally and known spam slowly, a site can slow down the rate at which spammers can inject messages into the mail facility.

One tarpit design is the teergrube, whose name is simply German for “tarpit.” This is an ordinary SMTP server which intentionally responds very slowly to commands. Such a system will bog down SMTP client software, as further commands cannot be sent until the server acknowledges the earlier ones. Several SMTP MTAs, including Postfix and Exim, have a teergrube capacity built-in: when confronted with a client session which causes errors such as spam rejections, they will slow down their responding [4]. A similar approach is taken by TarProxy.
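The slowing-down behaviour can be reduced to a delay schedule that grows with the number of errors a client session has caused. The function below is a hypothetical policy, not the formula used by Postfix or Exim: the exponential growth, the one-second base, and the 60-second cap are all assumptions.

```python
def response_delay(error_count: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Seconds to wait before answering a client that has caused errors.

    Well-behaved sessions (no errors) are answered immediately; each further
    error doubles the delay up to the cap.
    """
    if error_count <= 0:
        return 0.0
    return min(cap, base * 2 ** (error_count - 1))


for errors in (0, 1, 3, 20):
    print(errors, response_delay(errors))
```

A server would sleep for this long before sending each reply in the session. Capping the delay keeps the server from holding connections open indefinitely, which would consume its own resources as well as the client's.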

Another design for tarpits directly controls the TCP/IP protocol stack, holding the spammer’s network socket open without allowing any traffic over it. By reducing the TCP window size to zero, but continuing to acknowledge packets, the spammer’s process may be tied up indefinitely. This design is more difficult to implement than the former. Aside from anti-spam purposes, it has also been used to absorb attacks from network worms. [5]

As of late 2005 much of the spam sent is through so-called “zombie” systems, of which there are potentially a very large number. This makes the actual effectiveness of tarpits questionable, as there are so many spam sources that slowing just a few has little real effect on the volume of spam received.

Another approach is simply an imitation MTA (open relay honeypot) which gives the appearance of being an open mail relay. Spammers who probe systems for open relay will find such a host and attempt to send mail through it, wasting their time and potentially revealing information about themselves and the source of spam to the unexpected alert entity (in comparison to the anticipated careless or unskilled operator typically in charge of open relay MTA systems) that operates the honeypot. Such a system may simply discard the spam attempts, submit them to DNSBLs, or store them for analysis. It may be possible to examine or analyze the intercepted spam to find information that allows other countermeasures. (One honeypot operator was able to alert a freemail supplier to a large number of accounts that had been created as dropboxes for the receipt of responses to spam. Disabling these dropbox email accounts made the entire spam run, including the spam messages relayed through actual open relays, useless to the spammer: he could not receive any of the responses to the spam sent by gullible customers.) The SMTP honeypot may also selectively deliver relay test messages to give a stronger appearance of open relay (though care is needed here as this means the honeypot itself and the network it is on could end up on spam blacklists). SMTP honeypots of this sort have been suggested as a way that end-users can interfere with spammers’ activities (code: Java [6], Python [7]).

As of late 2005 open relay abuse to send spam has greatly declined, resulting in a lowered active effectiveness of open relay honeypots. (Passively, the honeypots or threat of same create an inducement for spammers to not abuse open relays.) Other types of honeypot (below) may still have great effectiveness.

Spammers also abuse open proxies, and open proxy honeypots (proxypots) have had substantial success. Ron Guilmette reported in 2003 that he succeeded in getting over 100 spammer accounts terminated in under 3 months, using his network (of unspecified size) of proxypots. At that time spammers were so careless that they sent spam directly from their servers to the abused open proxy, making determination of the spammer’s IP address trivial, so that it was easy to report the spammer to the ISP in control of that IP address and easy for that ISP to terminate the spammer’s account.

Unlike most other anti-spam techniques, tarpits and honeypots work at the relay, proxy, or zombie (collectively, “abuse”) level. They work by targeting spammer behavior rather than targeting spam content. One beneficial fallout from this is that these tools are not required to have any means of distinguishing spam from non-spam. Because they capture spam at the abuse level, they are not part of any legitimate email pathway, and it can be confidently assumed that what they capture is 100% spam or spam-related (e.g., test messages). Anti-spam measures at (or after) the destination server level protect specific email addresses but must include code to distinguish spam from non-spam. Anti-spam measures at the abuse level protect whatever email addresses are being targeted by the spam directed through them and are hence non-specific, but need no code to distinguish spam from non-spam. The main purpose of abuse-level tools is targeting spam and spammers themselves, while the main purpose of server-level tools is to protect specific email addresses. What abuse-level tools lose in specificity may be more than made up by the inherent simplicity that results from not having to be able to separate valid email from invalid email.

In late 2005 Microsoft announced that it had converted an actual zombie system to a zombie honeypot. One result of this was a lawsuit by Microsoft against about 20 defendants, based on evidence collected by the zombie honeypot.

Note that there is some terminological confusion. Some people refer to “spamtraps” as “honeypots.” In this context a “spamtrap” is an email address created specifically to attract spam. These run at the destination level rather than at the relay, proxy or “spam zombie” level.

Challenge/response systems

Another method that internet service providers (or specialized services) may use to combat spam is to require unknown senders to pass various tests before their messages are delivered. These strategies, termed challenge/response (C/R) systems, are currently controversial among email programmers and system administrators.
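The basic flow can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name, the in-memory whitelist, and the random token standing in for the "test" are all hypothetical, and real C/R systems differ in what challenge they send and how they store held mail.

```python
import secrets


class ChallengeResponseFilter:
    """Hold mail from unknown senders until they answer a challenge."""

    def __init__(self, known_senders=()):
        self.known = {s.lower() for s in known_senders}
        self.pending = {}  # token -> (sender, list of held messages)

    def receive(self, sender, message):
        """Return ('deliver', None) for known senders, else ('challenge', token)."""
        sender = sender.lower()
        if sender in self.known:
            return ("deliver", None)
        # Reuse an outstanding challenge rather than sending a new one each time.
        for token, (held_sender, held) in self.pending.items():
            if held_sender == sender:
                held.append(message)
                return ("challenge", token)
        token = secrets.token_hex(8)
        self.pending[token] = (sender, [message])
        return ("challenge", token)

    def respond(self, token):
        """The sender passed the test: whitelist them and release held mail."""
        entry = self.pending.pop(token, None)
        if entry is None:
            return []  # unknown or already-used token
        sender, held = entry
        self.known.add(sender)
        return held
```

Note that the challenge itself is an outgoing message to whatever sender address the mail claimed, which is one reason such systems are controversial.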

This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia.

Protection against spam

End users can protect themselves from the brunt of spam’s impact in numerous ways.

Preventing Address Harvesting

Preventing spammers from obtaining your email address doesn’t really solve the spam problem, any more than avoiding all but the lowest-crime areas of a city solves crime. Many people cannot hide their email addresses, and most people want to meet new people via email; they just don’t want the flood of spam. Hiding your address may, however, reduce the amount of spam that you receive.

One way that spammers obtain email addresses to target is to use a spambot to trawl the Web and Usenet for strings that look like addresses. Contact forms and address munging are good ways to keep email addresses from appearing in these places. If the spammers can’t find the address, it won’t get spam from this source.

There are other ways that spammers can get addresses, such as “dictionary attacks”, in which the spammer generates a number of likely-to-exist addresses out of names and common words. For instance, if there is someone with the address adam@example.com, where ‘example.com’ is a popular ISP or mail provider, it is likely that he frequently receives spam.

Address munging

Posting anonymously, or with an entirely faked name and address, is one way to avoid this “address harvesting”, but users should ensure that the faked address is not valid. Users who want to receive legitimate email regarding their posts or Web sites can alter their addresses in some way that humans can figure out but spammers’ harvesting software can’t (yet). For instance, joe@example.net might post as joeNOS@PAM.example.net, or display his email address as an image instead of text. This is called address munging, from the jargon word “mung” meaning to break.
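The joeNOS@PAM.example.net example can be reproduced with a short Python sketch. The function names and the "NOSPAM" marker are illustrative assumptions; any scheme a human can undo by eye would serve equally well.

```python
def munge(address, marker="NOSPAM"):
    """Split a marker around the '@' so the posted address is not deliverable."""
    local, _, domain = address.partition("@")
    if not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    half = len(marker) // 2
    return f"{local}{marker[:half]}@{marker[half:]}.{domain}"


def unmunge(munged, marker="NOSPAM"):
    """Reverse munge(), as a human reader would do by eye."""
    half = len(marker) // 2
    head, tail = marker[:half], marker[half:] + "."
    local, _, domain = munged.partition("@")
    if not (local.endswith(head) and domain.startswith(tail)):
        raise ValueError("address was not munged with this marker")
    return f"{local[:len(local) - len(head)]}@{domain[len(tail):]}"
```

A harvester that copies the munged string verbatim ends up with an address in a domain that does not exist, which is the point of the technique.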

Contact Forms

Contact forms allow users to send email by filling out forms in a web browser. The web server takes the form data and forwards it to an email address. The user (and therefore the spam harvester) never sees the email address. Contact forms have the drawback that they require a website that supports server-side scripts. They are also inconvenient to the message sender, as he is not able to use his preferred e-mail client. Finally, if the software used to run the contact forms is buggy or badly designed, they can become spam tools in their own right.
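As a rough sketch of the server-side step, the Python function below turns submitted form fields into a message using the standard library's email package. The recipient address and function name are hypothetical. The check for line breaks addresses one common way a badly designed form becomes a spam tool: submitted text being copied into message headers, where an embedded line break could add extra recipients.

```python
from email.message import EmailMessage

# Hypothetical hidden recipient; it never appears in the page sent to the browser.
SITE_OWNER = "owner@example.com"


def build_contact_message(sender_name, sender_email, subject, body):
    """Turn submitted form fields into a message addressed only to SITE_OWNER."""
    for field in (sender_name, sender_email, subject):
        # Reject line breaks in anything that ends up in a header.
        if "\r" in field or "\n" in field:
            raise ValueError("line breaks are not allowed in header fields")
    msg = EmailMessage()
    msg["From"] = SITE_OWNER       # the server sends on its own behalf
    msg["To"] = SITE_OWNER         # fixed: never taken from the form
    msg["Reply-To"] = sender_email
    msg["Subject"] = f"[contact form] {subject}"
    msg.set_content(f"From: {sender_name} <{sender_email}>\n\n{body}")
    return msg
```

Actually sending the message (for example with smtplib) is left out; the design point is that the recipient is a constant on the server and never a form field.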

Disposable e-mail addresses

Many email users sometimes need to give an address to a site without complete assurance that the site will not spam, or leak the address to spammers. One way to mitigate the risk of spam from such sites is to provide a disposable email address — a temporary address which forwards email to your real account, but which you can disable or abandon whenever you see fit.

A number of services provide disposable address forwarding. Addresses can be manually disabled, can expire after a given time interval, or can expire after a certain number of messages have been forwarded. Some of these services allow easier creation of disposable addresses via various techniques.
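The three ways an address can stop working (manual disabling, a time limit, a message limit) can be modeled in a small Python sketch. The class and field names are assumptions for illustration and do not describe any particular forwarding service.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class DisposableAddress:
    alias: str                              # the address given to the site
    forward_to: str                         # the real account
    expires_at: Optional[datetime] = None   # time limit, if any
    max_messages: Optional[int] = None      # message limit, if any
    enabled: bool = True                    # set to False to disable manually
    forwarded: int = 0

    def accept(self, now):
        """Return the real address to forward to, or None if the alias is dead."""
        if not self.enabled:
            return None
        if self.expires_at is not None and now >= self.expires_at:
            return None
        if self.max_messages is not None and self.forwarded >= self.max_messages:
            return None
        self.forwarded += 1
        return self.forward_to
```

For example, DisposableAddress("shop@forwarder.example", "me@example.com", max_messages=2) forwards the first two messages and silently drops the rest.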

Defeating Web bugs and JavaScript

Many modern mail programs incorporate Web browser functionality, such as the display of HTML, URLs, and images. This can easily expose the user to pornographic or otherwise offensive images in spam. In addition, spam written in HTML can contain JavaScript programs to direct the user’s Web browser to an advertised page, or to make the spam message difficult or impossible to close or delete. In some cases, spam messages have contained attacks upon security vulnerabilities in the HTML renderer, using these holes to install spyware. (Some computer viruses are borne by the same mechanisms.) Also, the HTML can be used to signal whether a spam message is actually read and seen by a user.

Users can defend against these methods by using mail clients which do not automatically display HTML, images or attachments, or by configuring their clients not to display these by default.
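What "not displaying HTML" means in practice can be sketched with Python's standard email package: show only the text/plain part of a message and never render the HTML alternative, so no remote images are fetched and no script runs. The function name and the placeholder text are illustrative assumptions.

```python
from email import message_from_string, policy


def safe_text(raw_message):
    """Return only the text/plain body; never render HTML or fetch images."""
    msg = message_from_string(raw_message, policy=policy.default)
    # Asking only for 'plain' means an HTML-only message yields no body at all.
    part = msg.get_body(preferencelist=("plain",))
    if part is None:
        return "[message has no plain-text part; HTML not displayed]"
    return part.get_content()
```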

Avoiding responding to spam

It is well established that some spammers regard responses to their messages — even responses which say “Don’t spam me” — as confirmation that an email address refers validly to a reader. Likewise, many spam messages contain Web links or addresses which the user is directed to follow to be removed from the spammer’s mailing list.

In several cases, spam-fighters have tested these links and addresses and confirmed that they do not lead to the recipient address’s removal — if anything, they lead to more spam.

In late 2003, the U.S. Federal Trade Commission (FTC) launched a public relations campaign to encourage email users never to respond to a spam email, ever. This campaign stemmed from the tendency of casual email users to reply to spam in order to complain and ask the spammer to stop sending it.

Perhaps more significantly, since the sender address fields borne by spam messages are almost always forged, a reply to a spam message is likely to reach an innocent third party if it reaches anyone at all.

In Usenet, it is widely considered even more important to avoid responding to spam. Many ISPs have software that seeks out and destroys duplicate messages. Someone may see a spam and respond to it before it is cancelled by their server, which can have the effect of reposting the spammer’s spam for them; since it is not just a duplicate, this reposted copy will last longer.

Reporting spam

The majority of ISPs explicitly forbid their users from spamming, and eject from their service users who are found to have spammed. Tracking down a spammer’s ISP and reporting the offense often leads to the spammer’s service being terminated. Unfortunately, it can be difficult to track down the spammer, and while there are some online tools to assist, they are not always accurate. Also, spammers occasionally own their own netblocks. In this case the abuse contact for the netblock may be the spammer, and a report sent there can confirm that your address is live.

Examples of these online tools are SpamCop, Network Abuse Clearinghouse and Blue Frog. These provide automated or semi-automated means to report spam to ISPs. Some spam-fighters regard them as inaccurate compared to what an expert in the email system can do; however, most email users are not experts.

Consumers may also forward “unwanted or deceptive spam” to an email address (spam@uce.gov) maintained by the FTC. The database so collected is used to prosecute perpetrators of various types of scam or deceptive advertising.

Defense against email worms

In the past several years, scores of worm programs have used email systems as a conduit for infection. The worm program transmits itself in an email message, usually as a MIME attachment. In order to infect a computer, the executable worm attachment must be opened. In almost all cases, this means the user must click on the attachment. The worm also requires a software environment compatible with its programming.

Email users can defend against worms in a number of ways, including:

  • Avoiding email client software which supports executable attachments. The most frequently targeted client software for email worms are Microsoft Outlook and Outlook Express, both of which can easily be made to open executable attachments. However, other Windows-based email software is not immune to worms.
  • Using an operating system which does not provide an environment compatible with present worms. Essentially all current email worms affect only the Microsoft Windows operating system. They cannot execute on Macintosh, Unix, GNU/Linux, or other operating systems. In some cases, it is conceivable that a worm could be written for one of these systems; however, various security features militate against it.
  • Using up-to-date anti-virus software to detect incoming worms and quarantine or delete them before they can take effect.
  • Being skeptical of unsolicited email attachments. Since worms and other email-borne malware arrive in this form, some email users simply refuse to open attachments that the sender has not given them advance notice of.
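The last point, being skeptical of attachments, can be partly automated. The Python sketch below lists attachments whose filename ends in an extension that Windows will run when opened. The extension set is an illustrative assumption and far from exhaustive, and a filename check is no substitute for anti-virus scanning.

```python
import os
from email import message_from_bytes, policy

# Illustrative, not exhaustive: extensions Windows will run when opened.
RISKY_EXTENSIONS = {".exe", ".scr", ".pif", ".bat", ".cmd", ".com", ".vbs", ".js"}


def risky_attachments(raw_bytes):
    """Return the filenames of attachments with a risky final extension."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    flagged = []
    for part in msg.iter_attachments():
        name = part.get_filename() or ""
        # Only the final extension counts, so "photo.jpg.scr" is caught.
        ext = os.path.splitext(name.lower())[1]
        if ext in RISKY_EXTENSIONS:
            flagged.append(name)
    return flagged
```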

This article is licensed under the GNU Free Documentation License. It uses material from the Wikipedia.

Nicolae Sfetcu