Remove spam and duplicate records from MySQL tables

How to remove duplicate and spam content from a MySQL table:

 

1) Remove duplicate content:

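Create a tmp table with the same structure as your table and add a unique index on the columns that define a duplicate. Then copy the content from your table into the tmp table. INSERT IGNORE skips every row that would violate the unique index, so only the unique content is copied.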

create table tmp like `table`;
ALTER TABLE `tmp` ADD UNIQUE INDEX(text1, text2, text3, text4, text5, text6);
insert IGNORE into `tmp` select * from `table`;

Then delete the duplicate content from your main table:

Delete from
`table`
where id not in (select id from tmp);
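
The tmp-table steps above can be tried out safely before touching real data. The following is a minimal sketch, assuming an in-memory SQLite database instead of MySQL so that it runs without a server. SQLite has no CREATE TABLE ... LIKE, so the empty copy is built with CREATE TABLE ... AS SELECT, and INSERT OR IGNORE stands in for MySQL's INSERT IGNORE. The `dedupe` helper and the `comments` table are hypothetical names used only for illustration.

```python
import sqlite3


def dedupe(conn, table, key_columns):
    """Delete rows whose key_columns repeat an earlier row; return rows deleted.

    Table and column names are interpolated into the SQL, so pass only
    trusted identifiers.
    """
    cols = ", ".join(key_columns)
    cur = conn.cursor()
    # Empty copy of the table (SQLite stand-in for CREATE TABLE tmp LIKE ...).
    cur.execute(f"CREATE TABLE tmp AS SELECT * FROM {table} WHERE 0")
    cur.execute(f"CREATE UNIQUE INDEX tmp_uniq ON tmp ({cols})")
    # Rows that would violate the unique index are silently skipped.
    cur.execute(f"INSERT OR IGNORE INTO tmp SELECT * FROM {table} ORDER BY id")
    # Anything that did not make it into tmp is a duplicate.
    cur.execute(f"DELETE FROM {table} WHERE id NOT IN (SELECT id FROM tmp)")
    deleted = cur.rowcount
    cur.execute("DROP TABLE tmp")
    conn.commit()
    return deleted


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, text1 TEXT, text2 TEXT)")
conn.executemany(
    "INSERT INTO comments (text1, text2) VALUES (?, ?)",
    [("hello", "a"), ("hello", "a"), ("hello", "b"), ("hello", "a")],
)
removed = dedupe(conn, "comments", ["text1", "text2"])
remaining = [row[0] for row in conn.execute("SELECT id FROM comments ORDER BY id")]
print(removed, remaining)
```

The row with the lowest id in each group of duplicates is the one that survives, because the copy into tmp is ordered by id.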

We can also do this on the main table itself, but if your data is large it will take some time, so I prefer the way above. Note that ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this only works on older versions:

ALTER IGNORE TABLE
`table`
ADD UNIQUE INDEX (mycolumn1, mycolumn2);

Once the process is complete, remove the unique index from the columns.

2) Remove spam content:

i) Identify the spam first. For me, spam contents are those that have certain keywords in their text. These can be links, foul words, etc., so we can remove those rows with:

Delete from
`table`
where `mycolumn` REGEXP 'foul word|http://|buy|purchase'

ii) Remove rows that contain non-English (non-ASCII) characters such as Chinese, Russian, etc.

Delete from
`table`
where `mycolumn` != CONVERT(`mycolumn` USING ASCII);

 

iii) Remove records which don't meet the text length standards. For example, a comments column with just 3 characters is not a valid comment.

Delete from `table` where LENGTH(`mycolumn`) < 5;
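
Since these deletes cannot be undone, it can help to check the three rules against a few sample rows first. Below is a rough Python sketch of the same rules, not the MySQL queries themselves. The `spam_reason` function and `MIN_LENGTH` constant are hypothetical names, and the case-insensitive matching is an assumption meant to mirror how MySQL REGEXP usually behaves on non-binary text columns.

```python
import re
from typing import Optional

# Same example keywords as the REGEXP query above; replace with your own.
SPAM_PATTERN = re.compile(r"foul word|http://|buy|purchase", re.IGNORECASE)
MIN_LENGTH = 5  # same threshold as the LENGTH(...) < 5 query


def spam_reason(text: str) -> Optional[str]:
    """Return which rule flags the text as spam, or None if it passes all three."""
    # Rule i: keyword or link match.
    if SPAM_PATTERN.search(text):
        return "keyword"
    # Rule ii: any non-ASCII character (requires Python 3.7+ for str.isascii).
    if not text.isascii():
        return "non-ascii"
    # Rule iii: too short. The text is pure ASCII at this point, so the
    # character count equals the byte count that MySQL's LENGTH() reports.
    if len(text) < MIN_LENGTH:
        return "too short"
    return None


samples = ["Buy cheap watches", "Привет, мир", "ok", "Nice article, thanks"]
for sample in samples:
    print(repr(sample), "->", spam_reason(sample))
```

Keywords such as buy match anywhere in the text, including inside longer words, so it is worth reviewing what the pattern catches before deleting.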

How to create a cron job in Virtualmin

How to create cron jobs in Virtualmin / Webmin:

Step 1: Log in to your Virtualmin account.

Step 2: Go to Webmin >> System >> Scheduled Cron Jobs

In the tab section, click on Create a new scheduled cron job.

In the form, fill in the data as follows:

a) Execute cron job as: <give the username>. If the cron job is related to a selected domain, give the username for that domain, or use root.

b) Command: If you want to run a command like ps (to view running processes), just type the command. If you need to run a file, give the full path of the file.

c) Input to command: You can skip this. You only need to give an input value if your command needs input while it is running.

d) When to execute: The default is Hourly, but you can select the time frame you require.

Step 3: Click on Create to create the cron job.
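
For orientation, the form fields above correspond roughly to a standard five-field crontab entry for the chosen user. The sketch below only formats such a line and is not Webmin's own code. It assumes Webmin stores the job as an ordinary crontab entry, and both the `crontab_line` helper and the script path are hypothetical.

```python
def crontab_line(command, minute="0", hour="*", day="*", month="*", weekday="*"):
    """Format a crontab entry: minute hour day-of-month month weekday command."""
    fields = [minute, hour, day, month, weekday]
    return " ".join(fields + [command])


# The defaults run at minute 0 of every hour, i.e. an hourly schedule.
hourly = crontab_line("/home/example/backup.sh")  # hypothetical script path
# A daily run at 02:30 instead.
daily = crontab_line("/home/example/backup.sh", minute="30", hour="2")
print(hourly)
print(daily)
```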