
Hey @Twitter, here are some suggestions for dealing with spam

Twitter spam

I am befuddled by how @Twitter can miss some blatant cases of spam accounts. So much so that I have come close to concluding that these are paid accounts, and thus won’t be removed no matter how often they are flagged and/or blocked. Here are some suggestions, based on what I have observed of spammers on Twitter, for spam-matching rules to improve the catch ratio. The accounts I use as examples have been hand-picked, so my points are open to interpretation, and could be greatly improved with data that Twitter has, such as tweet rate, number of spam flags and blocks, etc. These checks could, for example, be triggered in escalating order according to the number of users flagging an account for spam.

[Update] @Ed has replied to my tweet and part of this post:

I said I had “come close” to that conclusion, not that they are paid accounts. But it defies all logic how an account like @kredits can still be up and running after close to 64,000 (yes, that’s sixty-four thousand) spam tweets that break many of the rules/filters I have written about below. It is not a question of writing an algorithm for every variation, but of a set of rules which each give an individual score, and a minimum total score to suspend an account. Basically, an x-spam-score followed by an x-spam-status that determines account suspension, or lack thereof.
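As a rough sketch of the score-and-threshold idea, in the spirit of email spam filters: each triggered rule adds its weight, and the total is compared against a minimum score. The rule names, weights and threshold here are all made up for illustration, and would have to be tuned against the data Twitter has.

```python
# Hypothetical rule weights and threshold, made up for illustration only.
RULE_WEIGHTS = {
    "fixed_tweet_interval": 3.0,
    "spam_keywords": 1.5,
    "repeated_url": 2.5,
    "reaction_tweets": 3.0,
    "random_account_name": 0.5,
}
SUSPEND_THRESHOLD = 5.0


def spam_score(triggered_rules):
    """Sum the weights of the rules an account has triggered."""
    return sum(RULE_WEIGHTS[rule] for rule in triggered_rules)


def spam_status(triggered_rules):
    """Return (score, suspend), loosely mirroring x-spam-score / x-spam-status."""
    score = spam_score(triggered_rules)
    return score, score >= SUSPEND_THRESHOLD
```

With these made-up weights, an account that only trips the keyword rule stays up, while one that also repeats URLs and sends reaction tweets crosses the threshold.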

Follower to following ratio

Some spam accounts use aggressive follow techniques to try and spread their trash, and auto-follow bots follow them back. The result is accounts with following/follower ratios close to one. Case examples: @Bqe1212 with a ratio of 1.01, or @vidalconsulting with 1.04. Others do not take this approach and follow only a few accounts, for example @kredits, with only 168 followers and following 49.
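A minimal sketch of the ratio check. The `tolerance` and `min_following` values are assumptions of mine, not anything Twitter uses; the second one is there so that small, ordinary accounts where friends follow each other back are not flagged.

```python
def follow_ratio(following, followers):
    """Following/follower ratio; infinite when nobody follows the account."""
    if followers == 0:
        return float("inf")
    return following / followers


def looks_like_follow_bot(following, followers, tolerance=0.05, min_following=500):
    """Flag large accounts whose ratio sits suspiciously close to one.

    tolerance and min_following are hypothetical values for illustration.
    """
    if following < min_following:
        return False
    return abs(follow_ratio(following, followers) - 1.0) <= tolerance
```

As the @kredits case shows, this check alone misses spammers who follow few accounts, so it can only ever be one input among several.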

Tweet rate

One case I observed (the account is now suspended, so kudos there) had the particularity that tweets were pushed out every three minutes exactly. Twenty-four hours a day. This is something -very- easy to catch (and equally easy to defeat, but hey, some spammers are dumb).
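Catching that kind of clockwork is a few lines of code. This sketch measures how much the gaps between consecutive tweets vary (standard deviation divided by mean); timestamps are assumed to be in seconds, and the `max_cv` and `min_tweets` cut-offs are made-up values.

```python
from statistics import mean, pstdev


def interval_regularity(timestamps):
    """Coefficient of variation of the gaps between tweets (0 = perfectly regular)."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        return None  # not enough data to say anything
    average = mean(gaps)
    if average == 0:
        return 0.0
    return pstdev(gaps) / average


def looks_scheduled(timestamps, max_cv=0.05, min_tweets=20):
    """True when an account tweets at near-constant intervals.

    max_cv and min_tweets are hypothetical thresholds for illustration.
    """
    if len(timestamps) < min_tweets:
        return False
    cv = interval_regularity(timestamps)
    return cv is not None and cv <= max_cv
```

A spammer only has to add random jitter to defeat this, which is why it should feed a score rather than decide a suspension on its own.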

Tweet content

We can split this check into various sub-checks:

1. Keywords

In the case of @jenlock1014, the word ‘money’ appears in almost every single tweet pushed out. The actual text of the tweets varies, as do the linked URLs, but the keyword is there. Other usual keywords are ‘free’, ‘cash’, and so on.

2. Linking the same URL

In some cases we see links to the same URL in every tweet, such as @Bqe1212, with tweets like:

http://twttr.me/dbxV Q&A: HOW CAN I MAKE MONEY FAST ON THE INTERNET FOR FREE!! NO …: by Chri… http://bit.ly/aJVi6W http://twttr.me/dbxV

and

http://twttr.me/dbxV How to Make Money Online With Online Writing Sites: There are many sites … http://bit.ly/cichxl http://twttr.me/dbxV

The target site’s short URL differs, but every tweet contains copies (two in this case!) of the same short link. Again, both tweets would also trigger rule #1 above for keywords.

3. Linking the same URL with differing URL shorteners

One technique often used is to spread the target link among various URL shorteners. This is the case of @kredits, which uses snurl.com, ej.uz, short.ie, bit.ly, and others, all of which redirect to the same final URL. A simple check, once an account is flagged for processing, is to follow all shortened URLs and look for patterns. For example:

  • Exactly the same URL.
  • Same host, same path, but varying query string (oft used to track sources).
  • Same host, varying path, but same query string.
  • Same host with both varying path and query string.
  • Varying subdomains of the same host.

A combination of the above can be used to determine a spam score for a set of given URLs. An extra check when fuzzing techniques are used on the final URL is to parse the target site’s content, looking for similar headers, keywords, image URIs, Google Analytics account IDs, etc.
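The URL patterns listed above can be sketched as a pairwise comparison. This assumes the short links have already been followed to their final URLs (no network code here), ignores scheme and fragment, and uses a naive "last two labels" rule to group subdomains; real code would need a public-suffix list for hosts like example.co.uk. The category names are my own.

```python
from urllib.parse import urlsplit


def base_domain(hostname):
    """Naive base domain: the last two labels of the hostname (an assumption)."""
    return ".".join(hostname.split(".")[-2:])


def compare_urls(first, second):
    """Classify how two already-resolved URLs relate to each other."""
    a, b = urlsplit(first), urlsplit(second)
    host_a = (a.hostname or "").lower()
    host_b = (b.hostname or "").lower()
    if not host_a or not host_b:
        return "different"
    if host_a != host_b:
        if base_domain(host_a) == base_domain(host_b):
            return "same_domain_different_subdomain"
        return "different"
    same_path = a.path == b.path
    same_query = a.query == b.query
    if same_path and same_query:
        return "same_url"
    if same_path:
        return "same_path_different_query"
    if same_query and a.query:
        return "different_path_same_query"
    return "same_host_only"
```

Each category could then carry its own weight in the overall spam score, with "same_url" weighing the most.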

Reaction tweets™

Many times a spammer searches for certain keywords, and sends a reaction tweet when one is found. As an example, when I sent this reply to Ed Shahzade (@Ed) in reply to his tweet about auto-follower bots and spam, I received this other tweet from @atraiskredits:

@mikepuchol Problēmu var atrisināt ātrais kredīts? Izvērtēs kredīta piedāvājumu! Atver www.opencredit.lv un gaidi naudu savā kontā.

Obviously this is not English, and thus it was sent as a blind reply to my tweet mentioning @kredits, without caring much about my original language or whether I understand the content of the tweet.

On a flagged account, it should be very easy to check when response tweets are sent, by accumulating the words used in the original triggering tweets, and testing the occurrence of each word in all, or a high percentage, of them. As another case example, 10 minutes after @djsandman813 was sent this tweet by @kredits, and he replied this, @atraiskredits sent this reaction tweet. Screenshots below in case they go missing:

* OK “Reaction tweets” is not really trademarked, but maybe it should be!
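The trigger-word check for reaction tweets could look something like this sketch: collect the tweets a flagged account replied to, and keep the words that show up in most of them. The `min_share` and `min_length` parameters are my own assumptions; the second is a crude way to skip short, common words.

```python
import re
from collections import Counter


def trigger_words(triggering_tweets, min_share=0.8, min_length=4):
    """Words found in at least min_share of the tweets an account reacted to.

    Returns a dict mapping each such word to the share of tweets containing it.
    min_share and min_length are hypothetical values for illustration.
    """
    if not triggering_tweets:
        return {}
    counts = Counter()
    for text in triggering_tweets:
        # Count each word once per tweet, ignoring case and punctuation.
        words = set(re.findall(r"\w+", text.lower()))
        counts.update(word for word in words if len(word) >= min_length)
    total = len(triggering_tweets)
    return {word: n / total for word, n in counts.items() if n / total >= min_share}
```

If a single word such as "kredits" appears in nearly every tweet the account replied to, the replies are almost certainly keyword-triggered rather than conversation.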

Account aggregation

Spammers can try to avoid being flagged, or delay detection, by spreading their activity across multiple accounts. The way to detect this is to run the above filters across flagged accounts as a group, e.g. catching several accounts all sending reaction tweets with the same short URL.

Account name

Many spammers are not too creative and simply throw random words and letters into the account name – this can also be an indicator of a spammer account.

Reaction flags

When a user receives a spam tweet, their first reaction is usually to block the sender or flag it as spam. An accumulation of such flags, particularly with other indicators such as single tweets towards a user followed by a flag (denoting not a conversation but a directed one-way message), should be enough to suspend an account.

What else?

I’m sure there are many other checks possible, but I have to get back to work – so, @delbius, do I get a job offer? Just kidding – was thinking of the guy who got offered a job at YouTube after writing ‘YouTube Instant’.

First impressions on AutoCAD for Mac OS

AutoCAD_BSOD

It should have been a premonition – while looking for other reviews or info on the upcoming AutoCAD for Mac OS release, I stumbled upon this post by Steve Johnson, owner of cad nauseam, in which he details why AutoCAD for Mac would be a bad idea. While I agreed with some of his views, as this has happened countless times before (case example: Skype, which took years to catch up and still lags behind its Windows version in features and stability), I believed things wouldn’t be that bad.

It turns out there is a list of over 80 holes which Autodesk lists here. Steve has posted this in response, and we now even have an interview with Autodesk staff with money quotes such as:

It really does not make sense for us to implement features on the Mac platform that nobody’s going to use. So basically what the customers are asking for is that we are going to deliver. So like I mentioned before Mac users on the Architecture side shouldn’t notice much of a difference.

OK so you release a trimmed product with not-so-oft used features missing, but at the same price? That doesn’t really fly, no matter how you look at it.

San Rafael, we have a problem

My first reaction when I saw the activation window, right after installing AutoCAD on my Mac Pro, was “OMG they have transplanted the Windows version using Java”. It was SO ugly – in essence, a copy-paste of the Windows workflow into a Mac window, and I suspect they load the content as HTML from a server. Scrollbars? On a modal tool window?

This was before I had to activate the product, which required creating a whole new account, as my teacher login details would not work at all. Autodesk apparently “had no record” of my email address and password, so I had to go through the account creation once more, then finish the activation process, which takes a few more, totally unnecessary, steps. A simple “give us your product code and serial” followed by a “thank you for activating” is more than enough.

Once you fire up AutoCAD, you’re greeted by this splash screen:

which is not particularly informative, but still, shows something. The next thing you are greeted with, at least with the educational version, is this:

OK, so I am using an educational product, but you don’t need to keep reminding me of this fact every time I open a drawing done with a different version – a “don’t remind me again” checkbox is all it takes. The warning would be useful if it came up with drawings you made with a full AutoCAD version, given by others, etc. but it also shows up when creating a new drawing from one of the included templates!

Finally, expecting a normal drawing area, I was greeted by this:

It almost reminded me of a BSOD. No matter what I did, I could not even get the cursor to appear in the drawing area, never mind actually draw something. The application was completely unusable. Creating a new drawing, from a template, blank – nothing worked. I either got the blue bars of death (BBOD) or a black drawing area into which everything was sucked, à la black hole. No cursor, cross-hairs, nothing.

The next logical step was to open a drawing recently created using AutoCAD on Windows, and this came up:

I give up. I will try to install it on my MacBook Pro, and see what happens. If the problems reappear, I’ll go back to BootCamp or VMware with the Windows version, which is fully-featured, stable, and usable. Nice and commendable work on bringing AutoCAD back to the Mac, but so far, it appears the bugs and missing features, even when they are rarely used ones, are killing the product. Again.

Getting MAMP 1.9 to work with Image Magick, imagick.so and other flora

It was a full eight hours of hair pulling. For some reason, all the tutorials that can be found on getting MAMP to work with Image Magick in Snow Leopard are incomplete, missing information, or dated. Or all of the above. They are excellent posts, but I could not get imagick.so to be loaded as a PHP module by following any of them. I won’t go into explaining what MAMP or Image Magick are; if you are reading this, you already know, and most likely are having the same problems I was having.

Here is a short list of the resources I used to write this procedure:

Getting Imagemagick (and more) to work with MAMP on OS X – misses info on compiling for Snow Leopard.

Installing Image Magick and Imagick for PHP for MAMP – misses change needed in ports conf file to enable Universal mode.

MAMP & Imagick on Snow Leopard – goes through the pitfalls, which makes the tutorial confusing, but goes into the Universal mode switch.

There are others which I may have missed, such as forum posts or other blogs; if so, my apologies. In all, none of them go into the use of older libraries by MAMP in its sandboxed model, which breaks imagick.so when trying to compile it from source rather than using pecl.

1. Install MacPorts

I won’t go into details as you most likely have already done it if you’re reading this. Don’t update your ports yet!

2. Make MacPorts build Universal binaries

Simply edit /opt/local/etc/macports/variants.conf and add +universal at the end of the file. Now, update your ports collection by running:

sudo port -v selfupdate

3. Install Image Magick using MacPorts

Simple:

sudo port install ImageMagick

This takes a while, so go grab a coffee.


On Nikodemo, venture capital, and WebTV

First of all, and since I know how Albert feels right now, I want to send him all my encouragement for his new project, the WebSeries Festival. That said, I cannot stay on the sidelines given how much ink has been spilled over the WebTV model, venture capital, and entrepreneurs, for better and for worse. I have myself been told “no” repeatedly, and heard that the project was not getting “traction”, or that it was missing things. Ultimately, the repeated “no” forced the sale of Whisher under less than ideal conditions (however much my former partner tries to dress it up on his LinkedIn profile, though that is another story that is beside the point here).

It saddens me to say goodbye to series like Cálico Electrónico, which in its early days led us to contact Albert about having them create an animated intro video for Whisher – although in the end it never happened because of substantial changes to our website. The closure of Nikodemo & Co. was forced by the failure to find funding that could sustain the project, which was still losing money – even though its results were positive in terms of audience and social impact. Albert complains about the lack of “risk” in the “risk capital” equation (capital riesgo, the Spanish term for venture capital), although perhaps the first mistake was the choice of funding sources. Pure venture capital (hereafter VC, as in the contracts), as understood in the entrepreneurial world, is unfortunately very scarce in Spain. A few funds come to mind, such as Nauta, Debaeque, Adara, or Perennius. More abundant are the “business angels”, who are like a VC but without such a big stick for when things go wrong. Below that we have the countless funds, loans, grants, incubators, technology parks, and pseudo-VCs. The most worrying are the last of these, since in the other cases things are quite clear from the start. When you take out a NEOTEC-type loan, the terms are clear:

The company will repay the aid to CDTI as it generates positive cash flow. To that end, the company undertakes to provide CDTI with its closed annual accounts every year. The annual repayment will be up to 20% of the positive cash flow generated, until the loan is fully repaid.

In other words, you bear no risk. If the company never takes hold, you do not have to mortgage or sell your house and go live under a bridge to repay loans. CDTI also shields itself somewhat from its own risk, in this way:

CDTI advances to the company, on signing the contract that governs the NEOTEC aid, between 40% and 60% of the approved aid. The remainder is paid to the company upon completion and technical and financial justification of the approved project/business plan.

If your numbers do not work out, CDTI will have lost at most 60% of what is, in turn, 70% of the total project cost, which is what they grant (that is, at most 42% of the total cost). Other kinds of official aid follow similar terms, and make a good seed-capital option. The only problem is the arduous application and processing, which can sometimes drag on for months, too long for a startup. To partly solve this problem, a number of companies have appeared that advise startups through the process in exchange for monthly fees and/or a percentage of the capital raised – also a topic for another time.


My PCB business card flashes its LEDs!

Finally, I received the new PCBs from the manufacturer, after the first batch were found to be defective on track continuity (possibly due to too aggressive etching). This is a short video showing how the first one I assembled and programmed works:

Starbucks Spain rolls out free BT OpenZone WiFi

It had to happen: ever since Starbucks and Swisscom ended their contract a few months ago, WiFi has been missing from Starbucks in Spain. Some stores still have the old routers switched on with the ‘eurospot’ SSID, so I guess it will take some time to get them all replaced.

Grabbing a coffee I noticed something new on the receipt:

Starbucks WiFi

At last, WiFi at my local Starbucks! It’s a shame that it comes a few months late, but welcome nevertheless. It seems that if you own a VIPS card you get double time, up to 90 minutes. The launch was confirmed by BT via Twitter (nice to see they are on top of things!). All they need now is some nice PR material at the stores to show people that WiFi is available, and how to get online.

[Update] I tested the connection on my iPhone today, and there are a few things that need fixing:

- The WISPr code is not fully recognized by the iPhone, and thus you are shown the hotspot’s default landing page.

- Once you have the landing page, you need to tap through to ‘other operators’ so that you can login with the provided BT credentials.

- The BT login page is not mobile-formatted, which makes it a pain to navigate in order to login, even on the iPhone.

- There should be only a password for the free session; having such a complex username/password combination is going to put some people off (“where’s the damn forward slash on this phone?!”).

Ideally, BT OpenZone should recognize mobile Safari, and present either a formatted landing page, or suitable WISPr code for the iPhone’s built-in authentication software to kick in. Otherwise, the WiFi connectivity is superb!

Apple, please give the Magic Mouse new gestures

YMMV, but I’m very happy with Apple’s Magic Mouse – the absence of a scroll wheel or button is bliss, and the ergonomics, while not as good as some mice by Logitech, are quite good. What I do miss however is the third button, and having other commands via alternative gestures. TUAW even posted an article earlier calling the Magic Mouse a “dog”, in part, due to the lack of a middle button.

Thanks to ifixit.com, we can appreciate that the underside of the Magic Mouse sports a grid of 10 x 13 sensor pads from the logo towards the top of the mouse, and two final rows at the very top, one with 8 sensors and the other with 6:

Magic Mouse sensors

This gives the mouse the potential to detect one or two fingers placed on it, and the individual motion of each finger, anywhere on the mouse’s usable surface. So, I propose the following gestures. To middle-click (third button), simply place two fingers on the mouse, and click with both at the same time, thus:

Middle click

To open Exposé or Spaces, place the right finger on the mouse, and scroll up or down with the left finger:

Expose

It’s actually quite ergonomic, try it on your mouse, and if you like it, send the suggestion to Apple – if anyone finds a form specific to the Magic Mouse, let me know, otherwise the link points to iMac feedback (as it comes with one). And yes, the images above are taken from Apple’s page and badly photoshopped, my apologies!

Google GPS? Not so fast!

So Erick Schonfeld took a shot at the iPhone maps app, which uses Google Maps as its data source, and all other car-mount GPS manufacturers such as TomTom or Garmin, saying that Google should make Apple beg for maps navigation. I don’t agree with much of his post, here is why:

  1. Real-time navigation availability depends on the type of license map data is served under, as I explained in a post a few months ago. The map data served by Google to Apple for use on the iPhone does not allow real-time, turn-by-turn navigation, thus, it is cheap and much less money flows from Apple to Google for it. This is explicitly referenced in the iPhone SDK’s licensing terms. Google must be paying a premium on the data it serves on the Android GPS app for this kind of use.
  2. A real-time navigation system depends on constant availability of maps, which means online devices, such as an Android phone running Google’s app, must have perfect wireless coverage, in terms of both connectivity and bandwidth, and we know this is next to impossible. A comment on Erick’s post suggests Google caches map data when the route is created, which would be fine…if people followed the route perfectly. Many times, this is impossible for a number of reasons, such as bad routing, roadworks, or heavy traffic. All of these require re-routing, so Google, and any online system, would need to cache also every possible deviation and re-routing from the original path, which is impossible. There is a reason why TomTom’s iPhone app comes loaded with several hundred megabytes of map data.
  3. The GPS chipset on mobile devices is not well-suited for high-rate position updates. This is evident if you use TomTom’s iPhone app, and also from the fact that TomTom includes a separate GPS chipset in their iPhone car kit, for “…the most accurate positioning”. Since a higher position update rate means more battery consumption, and a phone has a ton of battery-consuming electronics of its own, the phone’s GPS typically provides less frequent updates than a dedicated GPS device.
  4. Dedicated GPS units are best at taking you from A to B, re-routing you within a couple of seconds if you deviate, and showing you the location of safety cameras and other points of interest (POI). As you go up the price ladder, you are provided with additional functionality, such as voice commands, phone connectivity for hands-free audio and real-time traffic data. On this particular point, I totally agree with Matt Burns on his CrunchGear post, who says of GPS makers: “They are in the habit of producing 78 different versions of the same GPS. Each model steps you up $20 and adds another feature”. But I digress. With such a model of charging for map updates, or for safety cameras, would they not also be charging for POI data if it was of any real use in vehicle navigation? Like updates to the “Restaurants” category? No, the issue here is that POIs are the least used feature in GPS navigators, and the makers know this. You may occasionally look for the nearest gas station, but that’s about it. If you want to eat something, you will ask around at your destination, or will have looked up options before the trip, but very, very rarely do people go looking for stuff on their GPS devices. It’s true that Google makes it a lot easier to access this kind of information, and puts it right there in your face, but nothing will beat a dedicated service such as Yelp, or a dedicated app such as Bliquo (shameless plug for my good friend David Douek, who works there now, hope it helps your SEO a tiny bit!).
  5. You can pick up a dedicated GPS unit for almost what you will spend on car mounts and cig-lighter adapter cables. They have faster routing, better planning capabilities, no need for wireless connectivity, and a much better audio output than any mobile phone.
  6. You are supposed to be looking at the road while the GPS guides you by voice instructions, not at the GPS screen while it provides you with fancy data and/or graphics. Once you safely stop to look at the GPS, there are much better ways to present useful data, such as POIs, than Google’s interface. Many countries are looking into forcing GPS manufacturers into blanking the screen while the vehicle is moving in order to further prevent distractions to the driver.
  7. TomTom, as an example, can add natural voice route requests to their higher-end units via software updates. Some already feature dictated destination input, but its use is clunky and not very useful right now – I bet we will see improvements soon. All it takes is the licensing of a proper speech-recognition engine. Google doesn’t have any major competitive advantage here, other than being the first to implement an (allegedly pending actual reviews) good functionality.
  8. TomTom owns Tele Atlas, and Nokia owns NAVTEQ, which combined provide a huge chunk of the map data used by Google Maps. I love you Fake Steve, but you’re wrong on this one – GPS makers are fine, and they know it. Unless Google is planning on re-creating all the map data on their own of course, which is discussed extensively on this post by James Fee, but this would only mean Google would be free from other providers, not crush them.
  9. Erick argues that “…the future of mobile apps are Web apps”. I think this is a huge over-simplification – the future of some mobile apps is apps that pull some or all of their data from the web. I regularly use an iPhone app that provides emergency response information on hazardous material (HazMat) incidents – I would be screwed if I had to depend on cellular coverage and a web service for this! We all saw how long Apple’s hard stance on iPhone web apps lasted, and the App Store just broke the 100,000 approved app barrier, so I rest my case.
  10. Moving away from the GPS-centric topic, I’ll question whether Google really developed the Mail and Search functionalities of the iPhone – AFAIK, these are implementations of Mail and Spotlight respectively. Can anyone confirm this one?

Wi-Fi Direct explained for those who think it is ad-hoc mode revisited

While it does contain most of the ad-hoc stack, the recently announced Wi-Fi Direct standard is actually an attempt to become more like Bluetooth. Ever since Wi-Fi was invented, ad-hoc mode has allowed two or more adapters to form a peer-to-peer network without an access point (AP) running the show. In certain scenarios, there would be connectivity problems when adapters were not configured for automatic IP assignment in the auto-discovery range, or had static IPs set up. With those out of the way, the user would then have to make sure his operating system had the appropriate sharing protocols enabled so that meaningful things could happen, such as sending files from one machine to the other.

From the start, Bluetooth has provided a communications stack and a protocol stack, which encompasses a growing number of profiles. An application only has to talk to the right profile in order to establish communication with another Bluetooth device supporting the same profile, for example, serial port, audio gateway or FTP. During device discovery, a Bluetooth device queries the other about its available profiles, and then chooses the right one as needed. As a practical and recent example, the iPhone initially supported the headset profile, which provides very rudimentary headset functionality and leaves out things like address book access. Over time, the iPhone has been upgraded with more profiles, some as complex as A2DP, which allows highish-definition audio to be sent to stereo headphones or speakers. I say “highish” as it uses an audio bandwidth of 16kHz, way below the normal audio response of a set of headphones, leading to a noticeable decrease in quality. But I digress.

In my view, the press release was very badly worded, making it appear as a re-branding of old-time ad-hoc, when it really implies adding a number of protocol stacks and profiles to the standard ad-hoc mode. It is also an attempt to take Bluetooth head-on, with the argument that Wi-Fi is a gateway to a bigger number of services – Bluetooth DSL router, anyone? no? I rest my case. Having such a set of profiles would obviate the need to have Bluetooth chipsets on top of Wi-Fi, which are always an added cost and source of radio interference. We would then have to see how audio accessories cope with this, but then again, I’ve not seen many people carrying Bluetooth headsets around, while car kits can accommodate a Wi-Fi chipset thanks to their board space and bigger battery.

Hey everyone, faking a USB ID is not illegal, you know?

I read with interest the many articles being written around the USB-IF’s decision to give its blessing to Apple’s use of the USB vendor ID, and claim that Palm’s usage of Apple’s Vendor ID in the Pre violates its policy. Now let’s sit back for a minute, and consider what the USB-IF actually is.

The Implementer’s Forum, as it is known, is made up of various companies that helped develop the USB standard and its newer, faster derivatives. The USB-IF acts as a central clearinghouse that provides USB vendor IDs to manufacturers who wish to use USB ports in their products. Every vendor using USB is supposed to register with this forum and pay its fees, which then gives them the right to use the USB logo on their products, and an individual vendor ID, which combined with a product ID, identifies every device on a USB bus.

In theory, this is sweet and dandy, but in the real world, shit happens. Anyone who has played with hardware peripherals long enough will have seen at least once a device identified by Windows as something else – this happens when a vendor “clones” another vendor’s ID. Some can get away with using the other vendor’s ID and a random product ID, combined with a customized driver on CD. In fact, there are tons of products shipping today which bear the USB logo without paying any duties to the USB-IF, and thus, running with “pirated” IDs.
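To see why cloning works, remember that the vendor and product IDs are just two 16-bit numbers the device reports about itself. The sketch below parses the usual `vvvv:pppp` hex notation and applies a naive host-side check that trusts the vendor ID alone. The check is hypothetical and not a description of how any real software works, and the product IDs in the usage note are made up; only 0x05AC, Apple’s vendor ID, is real.

```python
APPLE_VENDOR_ID = 0x05AC  # vendor ID registered to Apple


def parse_usb_id(text):
    """Parse a 'vvvv:pppp' hex pair into (vendor_id, product_id) integers."""
    vendor, product = text.split(":")
    return int(vendor, 16), int(product, 16)


def naive_host_accepts(usb_id, trusted_vendor=APPLE_VENDOR_ID):
    """Hypothetical host-side check that trusts the reported vendor ID alone."""
    vendor, _product = parse_usb_id(usb_id)
    return vendor == trusted_vendor
```

Any device that reports "05ac:ffff" passes this check, whoever actually built it, because nothing on the bus verifies that the number belongs to the sender.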

The only power the USB-IF has is self-regulation. If you want to bear their logo, you need to pay their royalty, and agree to abide by their policies, including non-cloning of vendor IDs. So let’s say Palm gets booted off the USB-IF. They just need to remove the USB logo from their product (if they bear it at all – check your iPhone as an example), and they’re home free. They are free to use Apple’s vendor ID as much as they want, and there is no legal recourse Apple or the USB-IF has. With so much legal power, don’t you think Apple would have sued Palm already if there were grounds for legal action? Rather, they engaged in a technical cat-and-mouse game involving iTunes updates to kill off the attacker.

Personally, I think Palm is deluded. Making the Pre compatible with iTunes will not make it any more popular than it already is not. And Apple has every right to place technical blocks on the Pre, particularly if Palm misrepresents the vendor ID. Still, if I were Apple, I would have just ignored the issue. The Pre is not a threat to the iPhone, which is far superior in all aspects (apart from the non-removable battery).
