dataThief

What is dataThief

dataThief is a program to reverse engineer a set of data from a given plot in a magazine or journal. It lets you incorporate somebody else's data points in your own plots. This comes in very handy when, for instance, you would like to compare your data with the data in a published article for which you don't have the data in table format.
You need access to an Apple Scanner with the HyperScan HyperCard Stack, or some other scanner/program combination that lets you scan an image of a plot and save it as a PICT format file.

How does it work

• Scan the plot as well as possible.
• Save the image as a PICT format file. Because dataThief works on the bitmap, make the PICT image large enough to preserve the original resolution (but no larger).
• Open dataThief and select the file to be processed.
• Don't be afraid if the plot appears rotated and/or skewed on the screen. dataThief will correct for it, using the axes you'll define.
• Now move the mouse pointer to the leftmost point of a horizontal axis in your plot and click the mouse. A dialog box will appear, in which you can fill in the real X coordinate of this axis point.
• Then move the mouse pointer to the rightmost point of the horizontal axis and click the mouse. You will get another dialog box; fill in the appropriate value. Make sure this point has the same Y value as the first axis point, because the two points are used to correct for skew!
• Do the same for two points on a vertical axis (with the same X coordinate!).
• You will now be prompted for an optional correction of the real coordinates of the axes and for additional options, e.g. error bars, logarithmic axes and markers that make your data point clicks visible.
• Move the mouse pointer to the first data point you want to steal and click the mouse.
• If you have selected error bars, click on either the upper or the lower end of the error bar, then click on the other end. Repeat this for all data points.
• You can also perform an automatic scan. Click the starting point of the scan, click the end boundary and watch the scan go. In this case you can set several parameters to make the scan fit your wishes (see Trace Specs). Try it out and find the values that suit your scan best.
• When you have clicked all data points, select Save or Close from the File menu. This will save your data points (tab delimited) in a TEXT file which can be read by any word processor, spreadsheet or plotting program. If you like, you can set the 'creator'. This enables your favorite program for further processing to open the data file automagically. In that case, you probably want to leave the explanatory text out of the output file (see File Options).

Resolution

The accuracy of the derived data can never be greater than the resolution of the original plot allows. Be aware of that and be critical of your scanned data. Also keep in mind that coordinates are derived from a bitmap. So, if your scan allows it, make your PICT large enough not to lose accuracy (a screen has a typical resolution of 72 dpi).
In dataThief, if your PICT is smaller than the screen, the plot will be blown up to the screen size, as far as the aspect ratio allows. If the PICT dimensions are larger than the screen, the offscreen bitmap will have that size too, and you will only see part of it on the screen. Just scroll through the picture while setting the axes and taking your data. In this way you'll get the maximum resolution possible, but take care not to suggest more accuracy than your scan intrinsically has!

dataThief Menus

File Menu

Open
This opens a PICT type file and displays it on the screen.

Close
This closes the present window. If you have stolen data, you will be asked whether you would like to save it.

Save
This saves the stolen data in a file with default name 'input file name'_data. Only available if you have stolen data that has not been saved yet.

Save As
This saves the stolen data in a file with default name 'input file name'_data.
Quit
This closes the present window and exits dataThief. If you have stolen data that has not been saved yet, you will be asked whether you would like to save it.

Edit Menu

Undo
Hit this repeatedly to undo clicks on either data points or error bars.
Cut, Copy, Paste and Clear aren't supported within dataThief.

Options Menu

Auto Tracing
This option toggles auto tracing on and off. With tracing on, click on the first point of a curve, click an end value and watch it Gooooo...

Coordinates
This option toggles a floating window on and off. The window shows the coordinates of the point the cursor is currently pointing at and the coordinates of the last clicked or traced point.

Sound
This option toggles sound on and off. Use it if our "fun" sounds make you sick...

Zap

Zap Data
All clicked or auto-scanned data are gone! You can use this when scanning a plot with multiple curves: first scan the first curve, save the data, zap the data and then scan the second curve, etcetera.

Zap Axes
Use this to remove all entered values for the X and Y axes when you have made a mistake. This zaps the data too!

Trace Specs
In this dialog you can set some values that change the behaviour of the auto tracing.
Trace width sets the maximum horizontal gap, in pixels, that the tracer will jump over without stopping the trace.
Trace height sets the number of pixels the tracer will look above and below a data point to find the slope of the curve. If you increase this number, more pixels will be looked at; i.e. you can trace a steeper change in slope.
The default values are the ones we found to work best in most cases. dataThief uses the slope between the two previously scanned points as the starting point for the search.

Data Options
In this dialog you can change some of the behaviour of dataThief. It contains one checkbox and three radio buttons.
The checkbox Enable Markers does just that. When selected, dataThief will show markers on the spots you've clicked on.
The markers are a circle for data points and a square box for error bar points. Of course this only works when taking data manually.
The three radio buttons control the way error bars are handled:
• No error bars means the plot has no error bars.
• Asymmetric error bars are error bars which are not of equal length. In the output file you will get, apart from the X and Y values, a DY+ and a DY- value.
• Symmetric error bars have an equal length on both sides of the data point. In the output file you get just a DY value (= mean value of DY+ and DY-).
When either asymmetric or symmetric error bars are selected, you don't have to worry about which end to click first. dataThief is smart enough to know which of the two error points has the higher value; look at the cursor shape to see whether an error point or a data point is expected.

File Options
In this dialog you can set the creator of the output file. The default is 'dief', the dataThief file format. If you change this to e.g. 'QKPT', the file will be owned by KaleidaGraph, and can be opened by double-clicking it. A list of useful programs is provided.
Another option here is to skip the headers. When chosen, this will skip all text in the output file, i.e. only the numeric data will be written to the output file.
The last option here is the compression factor. When you change this to n > 1, the output file will contain only every n-th scanned point. This only applies to auto-traced data and doesn't affect any data clicked in single-click mode!

Save preferences
Saves the current settings of all options.

Defaults
Default values for all kinds of options are stored in 'dataThief preferences' in the appropriate location. Although they can be changed and saved from within the program, you can also inspect them with e.g. ResEdit; a template is provided. Only do this if you are familiar with resource editing; in fact there is no need for it.
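To make the error bar and compression options described under Data Options and File Options concrete, here is a minimal Python sketch. It is not dataThief's own code: the function names are hypothetical, and the choice to start the every-n-th selection at the first point, as well as the exact number formatting, are assumptions. Only the DY+/DY-/DY definitions, the every-n-th rule and the tab-delimited layout come from the text above.

```python
def error_columns(y, bar_a, bar_b, symmetric):
    """Return the error columns for one data point.

    bar_a and bar_b are the two clicked error bar ends, in either order.
    """
    upper, lower = max(bar_a, bar_b), min(bar_a, bar_b)
    dy_plus = upper - y    # DY+: distance from the point up to the upper end
    dy_minus = y - lower   # DY-: distance from the point down to the lower end
    if symmetric:
        # Symmetric bars: a single DY, the mean of DY+ and DY-.
        return [(dy_plus + dy_minus) / 2.0]
    return [dy_plus, dy_minus]


def compress(points, n):
    """Keep every n-th auto-traced point (assumed to start at the first)."""
    if n < 1:
        raise ValueError("compression factor must be at least 1")
    return points[::n]


def format_rows(rows, header=None):
    """Format rows as tab-delimited text; header=None skips the header."""
    lines = []
    if header is not None:
        lines.append("\t".join(header))
    for row in rows:
        lines.append("\t".join(f"{value:g}" for value in row))
    return "\n".join(lines)
```

For example, a point at Y = 10 with bar ends clicked at 13 and 9 gives DY+ = 3 and DY- = 1 with asymmetric bars, or a single DY = 2 with symmetric bars, regardless of which end was clicked first.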
The preference file also contains a STR# resource (8000) that describes the list of available output creators. You may want to customize this list. Each creator has two strings: the actual creator (type TNAM/OSType, 4 chars) and a string describing its program. Always add or remove an even number of strings! If the resource is missing, dataThief will create a new one.

Availability

dataThief is freeware: the program was developed as an aid to the physicists at our laboratory and as an exercise in Mac programming for us. It was decided to let it roam the world for free as a help to all who need to steal data from others :-)
However, it is not public domain: this program is ©1991-1994 by Kees Huyser & Jan van der Laan of the Computer Systems Group of the Nuclear Physics Section at the

National Institute for Nuclear Physics and High Energy Physics (NIKHEF-K),
PO Box 4395
1009 AJ Amsterdam
The Netherlands

keeshu@nikhefk.nikhef.nl  +31 20 592 2032
jan@nikhefk.nikhef.nl  +31 20 592 2031

The program is available by email from the authors and by anonymous ftp from SUMEX (and its mirrors) and sun4nl.nl. This version is also posted to comp.binaries.mac.
If this program is of any use to you, we'd appreciate a picture postcard from your home town.
Version 2 is beerware: if we ever meet, you'll buy us a beer. We will do the same when we meet Troy Gaul or James W. Walker. They provided Infinity Windoid and the Help dialog source respectively, both used in this program.

Legal stuff

You are encouraged to share this program with your friends and colleagues. Distribution of dataThief on any media, by any organization or individual, by itself or as part of any collection or group of software, must be provided free of charge to all users, outside of normal media or connect charges. If you wish to have dataThief shipped as part of a package or collection, you must receive the authors' approval in writing before distributing any copies. Send all inquiries to the address given in this help text.
The program is written in THINK C version 6.0.1, therefore parts of this program are © 1994 by Symantec Corp.

Disclaimer

dataThief is strictly provided as-is, with no warranty as to its suitability for use in any circumstance. We cannot be held responsible for bugs, incompatibilities, or any other problems you may find with this software. You assume total risk for using this package.
However, dataThief has been tested on a variety of machine configurations from the Mac Plus to the Centris 650, running System 4.2, 6.0.5-6.0.8, 7.0, and 7.1 with various extensions and hardware combinations, with no known problems.

Version History

If you find any bugs in this program, please write or e-mail us at the above address. Versions in between the ones mentioned were only for internal (ab)use.

2.0b (finally found some time somewhere, 05/22/94)
• Renewed the help dialog.
• MacPaint format can no longer be imported.
• Options are now saved in a preference file.
• The chosen axes can now be drawn, which is especially helpful for skewed plots.
• Cut-off boundary in autoscan.
• Added color where useful.
• Resizable window, and scrolling.
• Window size suited to the PICT size, but at least the screen size.
• A critical look over the code; found a lot of minor bugs almost nobody noticed.

1.0.8 (07/05/92)
• Fixed opening a data file under System 7 while QuickTime is installed.
• Refined the updating of floating windows and markers.

1.0.5 - 1.0.7 (92)
• Undocumented and therefore forgotten bug fixes.
• Floating window with coordinates.
• System 7 compatible and 32-bit clean.
• Included QuickTime preview.

1.0.4 (05/26/91)
• Fixed a bug while closing windows.

1.0.3 (03/24/91)
• Import of PICT format.
• A lot of improvements (we think) in the user interface.
• Improved autotracing.

1.0.1ß (02/24/91)
• Fixed a nasty bug while saving the stolen data.
• Some attention to the user interface.
• Correction for skewed pictures, i.e. data are corrected if the scanned plot is rotated.
• Added an autotrace option.

0.9ß (somewhere in 1990)
• Preliminary buggy release. Sent to an interested user who put it on SUMEX and posted it to comp.binaries.mac. We regret that this buggy version was distributed, although we can understand why.

Things to do

• Whatever is asked for and seems valuable (provided we find the time to do it).

Things not to do

• No fitting of data. Use a fit program to do this.
• No calculation of polynomials. Use Igor (highly recommended) or such.
• No editing of stolen data. Use a text editor.
• No... whatever makes dataThief into a Mac version of PAW (the Physics Analysis Workstation).
• Don't get caught with your hand in the cookie jar.
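
As a closing illustration of the axis calibration described under "How does it work", the following Python sketch shows one way four clicked axis points can be turned into a pixel-to-data mapping that tolerates a rotated or skewed scan. This is an assumption about a workable approach, not dataThief's actual code: the function name and argument layout are hypothetical, and only linear axes are handled (for a logarithmic axis one would calibrate on the logarithm of the axis values instead).

```python
def make_calibration(p1, x1, p2, x2, q1, y1, q2, y2):
    """Build a pixel -> data mapping from four clicked axis points.

    p1, p2: pixel positions (px, py) of two points on the horizontal axis,
            with real X values x1 and x2 (both at the same real Y).
    q1, q2: pixel positions of two points on the vertical axis,
            with real Y values y1 and y2 (both at the same real X).
    """
    # Pixel-space direction of each axis; they need not be perpendicular
    # or aligned with the screen, which is what absorbs rotation and skew.
    ux, uy = p2[0] - p1[0], p2[1] - p1[1]
    vx, vy = q2[0] - q1[0], q2[1] - q1[1]
    det = ux * vy - uy * vx
    if det == 0:
        raise ValueError("axis points coincide or the axes are parallel")
    kx = (x2 - x1) / det
    ky = (y2 - y1) / det

    def to_data(p):
        # X varies along the horizontal axis and is constant along the
        # vertical one, so use the gradient perpendicular to (vx, vy).
        dxp, dyp = p[0] - p1[0], p[1] - p1[1]
        x = x1 + kx * (vy * dxp - vx * dyp)
        # Y varies along the vertical axis and is constant along the
        # horizontal one, so use the gradient perpendicular to (ux, uy).
        dxq, dyq = p[0] - q1[0], p[1] - q1[1]
        y = y1 + ky * (ux * dyq - uy * dxq)
        return x, y

    return to_data
```

With an unrotated scan, for instance `to_data = make_calibration((100, 400), 0.0, (500, 400), 10.0, (100, 400), 0.0, (100, 100), 30.0)`, a click at pixel (300, 250) maps to X = 5 and Y = 15; the same function keeps working when the clicked axes are tilted.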