1.74 |
April 30, 2018 |
Fix: Can now access secure sites that require TLS 1.2
Rebuilt in .NET 4
|
1.73 |
July 7, 2015 |
PDF: read xref correctly
|
1.72 |
February 26, 2014 |
Email: Set send email credentials better
|
1.71 |
April 10, 2012 |
Search API: Fields now getting through again
Indexing: Cope with repeated Location headers
|
1.70 |
July 21, 2011 |
Indexing: PDF: Cope with unexpected name format
|
1.69 |
May 15, 2011 |
Indexing: Fix minor bug indexing PDFs
|
1.68 |
March 2, 2010 |
Indexing: Parse DOCX better so no broken words
Web service: Correctly remove High and Low surrogate characters from returned XML
Search databases: Check every hour for updates and reload if necessary (see web farm information)
Templates: nowrap removed from header template
Indexing: Remove port when creating an automatic search database
Indexing: For directory scans, cope if directory inaccessible
|
1.67 |
August 20, 2009 |
Highlighting: Remove various "If-" request headers that caused highlighting to fail when "304 Not Modified" was returned
Highlighting: "ShowCredentials" parameter supported in Web.Config
Indexing: Credentials now use "Negotiate" to support Kerberos
Dynamic database searching: don't show %DB_CREATION_DATE% if no main database loaded
Languages: 13 European languages added to user interface
Highlighting: works for Greek and Bulgarian text
Indexing: now uses latest version of ICSharpCode.SharpZipLib for unzipping new office files
|
1.66 |
April 3, 2009 |
Indexing: Cope with Content-Location HTTP header that refers to the current URL
|
1.65 |
March 9, 2009 |
Indexing: PDF: Output anchor "page=n" for pages 1 to 31.
Indexing: Check for cancel during directory find all files.
About: better statistics list
Log: log IP address and Robot name
Log: log pages highlighted
Log: XML encode message
About: Keep count of robot searches separately
Template: by default has meta robots nofollow and noindex
|
1.64 |
January 8, 2009 |
Indexing: PDF: Cope with unexpected too-large integer number
|
1.63 |
November 14, 2008 |
Indexing: PDF 1.5 format finally supported: cross reference streams and object streams
Indexing: PDF Flate DecodeParms Predictor 12/Up supported
Indexing: PDF \r\r line ends recognised
Indexing: Content-Type HTTP header used to override file type
Indexing: Use moved location of initial URL, eg follow "findinsite" to "findinsite/"
Indexing: Use "Content-Location" to reduce duplicate URLs indexed, eg "findinsite/" is the same as "findinsite/default.htm"
Indexing: Rules added
Highlighting: base tag and (changed) header added at better position in web page
Indexing email: server port added
Search: Load database files more efficiently
Templates: Consistently use %SEARCH_TEXT%, though %SEARCHTEXT% still supported
Output and templates: updated to use better XHTML
Output: default target supported
|
1.62 |
July 4, 2008 |
Indexing PDF: Finds endstream better
Highlighting: Fixed bug highlighting URL with non-standard characters
Search API: Snippet has search words highlighted using a SPAN with class hilite
Indexing: FieldsToExclude advanced option added
|
1.61 |
April 17, 2008 |
Indexing: Credentials now supports Integrated Windows Authentication
Indexing: Fixed bug when removing indexing from completed list
|
1.60 |
November 22, 2007 |
Compiled to run in ASP.NET 2.0+ web site
Search: dynamic database searching supported
Search: Highlighting of search words in results fixed for multiple subsets
Search: Field searches fixed for multiple subsets
Look and Feel: %DYNAMIC_DB% supported in header and footer
Config: Searching section new option added "Dynamic database searching regular expression"
Indexing: Cope with unusual BASE tag values
Indexing: Cope with Moved Location even better
Indexing: HardExclude advanced option added
Startup: reallySetLanguages exception handled
Indexing: PDF: Cope with format variant
Emails: sent using ASP.NET 2.0+ method
Config: Indexing From Email Password is a 'password' type input field
Search: cope with bad URL parameters better
|
1.51 |
May 8, 2007 |
Indexing: XLS and PPT: TextExtractor call bug fixed
Indexing: PDF and XLS: Floating point numbers identified correctly on non-English computers
|
1.50 |
December 18, 2006 |
Indexing: Algorithm changed to reduce memory requirement
Indexing: HTML: Cope with just 'text/html' and 'text-html' charsets
Indexing: PDF: indexing speed-ups
Indexing: PDF: Only report unrecognised encoding /Identity-H if PDF_ReportCharacterDecodeProblems set
Indexing: PDF: UnicodeEncoding bug fix
Indexing: Image: Find XMP (Extensible Metadata Platform) meta-data, eg Vista Tags
Indexing: Cope with (ie ignore) read errors
Indexing: Cope with include/exclude/robots after HTTP redirect
Indexing: Robots not case significant
Indexing: Pause every 100 files for 0.1 second
Indexing: Don't write fields or anchors if file not being indexed
Control Panel: Memory use and counts of searches and indexing runs since restart listed on About page
|
1.21 |
November 22, 2006 |
Indexing: Word 2007 DOCX/DOCM files supported
Indexing: Excel 2007 XLSX/XLSM files supported
Indexing: PowerPoint 2007 PPTX/PPTM files supported
|
1.20 |
September 21, 2006 |
Indexing: Ignore <?xml...> in web pages
Indexing: BASE tag supported
Config: Load template files in UTF-8
Highlight: Find charset more flexibly
Highlight: Fix bug if search word found in header
Language: Thai language supported
Search: Fix bug if space searched for
|
1.19 |
July 4, 2006 |
Indexing: Excel XLS file indexing - minor improvements
Indexing: Sections of web pages can be excluded using GoogleOn/Off and FindinSiteOn/Off comments
Indexing: URL recursion stopped using MaxURLLength, with default of 1024.
Look and Feel: displayError template supported in finderror.htt
General: FindinSite image returned accurately
|
1.18 |
March 23, 2006 |
Indexing: Excel XLS file indexing and searching supported
|
1.17 |
February 13, 2006 |
Language: "Languages to Use" option added to Look and feel Control Panel
Language: Language and text direction forced to English for config page heading
Email: SMTP Mail Host option provided on Indexing config page
Email: SMTP send basic authentication password support (can be stored in Web.Config appSettings)
|
1.16 |
October 28, 2005 |
Indexing: Publisher PUB file indexing now supported
Language: Norwegian language file added
General: Logo and web site changed; product renamed to findinsite-ms
General: Bug fix: Disregard include in template variable substitutions
General: Improved results sorting
|
1.15 |
July 29, 2005 |
Language: Bug fix: non-Western characters identified correctly
|
1.14 |
July 28, 2005 |
Language: Arabic (العربية) user interface added (thanks to Lubna Sorour)
Language: Arabic words now delimited by spaces etc
Language: Arabic character versions handled better (ا ى ه و)
Language: Arabic 'the' (ال) at start of word handled correctly
Language: Arabic search for 'the' by itself ignored
Language: Language files now assumed to be in UTF-8
Language: Right-to-left (RTL) languages supported using %L_HTML_TAG%, %L_BODY_TAG% and %L_ALIGN_TAG% strings in templates
Language: findinsite-ms version date localised
Language: Slovenian (Slovenščina) user interface added (thanks to Luka Malenšek)
Indexing: If Content-Type HTTP header specifies HTML charset, use this and ignore META charset.
Indexing: Try to determine HTML charset from META charset before main parse.
Highlighting: Bug fix: pages starting with UTF-8 marker bytes incorrectly recognised
|
1.13 |
June 29, 2005 |
Output: Extra linefeeds removed from around Included file content
Output: Included files are only sent form data if the included file is an .aspx page
Highlight: "highlighted by" footer removed because it was not shown in the correct position by Firefox on some sites
Installation: bin dll library files renamed with phdcc.fis. prefix - be careful to delete old DLLs before installing new ones
Search API: Remaining result line variables made available
|
1.12 |
May 27, 2005 |
Search API: Highlight URL returned now works with Firefox
Indexing: First suggested filename doesn't have 1 appended
Indexing: Results email includes URL, File or Directory
Indexing: Search db description not saved if indexing run edited
Indexing: Report better error if image file has zero length
Search: Bug fix: crash if search db not loaded successfully
Search: Remove ? from end of search if question asked, ie if more than 1 word
Config: cope better if existing search db corrupted
Config: better on-page JavaScript handling for create new indexing
Output: Site(s) being searched added to default template using %L_SITE% and %SITES%
|
1.11 |
May 19, 2005 |
Config: Very first control panel has easy option to make index and search
Indexing: For charset "text/html;" assume ISO 8859-1
Indexing: Unrecognised robots tags ignored
Indexing: redirect out of directory handled better
Indexing: .php added to default HTML file types
Highlight: content-type checked better, so aspx pages work
Highlight: works for sites that use Transfer-Encoding in response header
Search: cope with apostrophes better
|
1.10 |
April 15, 2005 |
Indexing: Username/password supported using new Credentials advanced option (basic/digest credentials supported)
Output: Various speed ups
Output: %L_APPNAME% not made HTML-safe
|
1.9 |
April 14, 2005 |
Indexing: PDF and TXT indexing speed increased
Indexing: Abort mid-file implemented
Indexing: Bug fixed: slowness if AbstractWords set to 0
Indexing: Redirections off-site not reported as errors
Indexing: Minor DOC parsing fixes
|
1.8 |
April 2, 2005 |
Indexing: Bug fixed: page redirection timeout
|
1.7 |
April 1, 2005 |
Indexing: Bug fixed: page redirection
|
1.6 |
April 1, 2005 |
Output: Results list has snippet excerpts from each page, with search words highlighted
Output: Default template redesign
Output: Styles used in many generated HTML elements
Output: New results variables supported: file size, date, date-indexed, word-count, etc
Output: More output dates localised
Output: New language file strings supported
Indexing: More information stored for each indexed file
Indexing: If a file fails Include or Exclude then it is still spidered and its links followed
Indexing: UserAgent and ObeyRobots advanced options added
Indexing: <br> not added to abstract at line breaks
Indexing: web errors made more concise: no stack trace
Indexing: AbstractWords now defaults to 0, ie abstract not obtained from first words of file
Highlight: Bug fixed: highlight fails for search of *
Highlight: Copes with bad HTML better
Config: Bug fixed: Pages now counted correctly when db removed
|
1.5 (5.4) |
March 3, 2005 |
Indexing: Crawl-Delay throttle implemented for robots.txt
|
1.4 (5.4) |
February 22, 2005 |
Indexing: Page redirect works better
Highlight: Bug fixed: does not pass on "accept-encoding" header
Output: Last run output for indexing in progress has better message
Output: Default result logo updated
Licensing: All starts logged at phdcc.com
|
1.3 (5.4) |
February 10, 2005 |
Indexing: robots.txt supported
Indexing: Cookies maintained throughout each indexing run, saving session state
System: Fix initialise security exception on some shared hosts
|
1.2 (5.4) |
February 2, 2005 |
Highlight: Highlight of hits in HTML pages; highlight configuration options added
API: Search API updated to add HighlightURL to each returned result
API: Search API bug fixed: GetFieldNames() causes exception if no fields available
Indexing: FindInSiteBot user-agent HTTP header added to indexer, referring to robots bot page
Indexing: Various PDF indexing fixes
Indexing: REL="nofollow" supported in A tags
|
1.1 (5.4) |
January 4, 2005 |
Release
|
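
Note on the 1.63 entry "Indexing: PDF Flate DecodeParms Predictor 12/Up supported": in PDF, Predictor 12 means each row of the inflated stream starts with a one-byte PNG filter tag, and tag 2 ("Up") stores every byte as a difference from the byte directly above it. The sketch below illustrates that decoding step only and is not the findinsite-ms code. The function name `decode_flate_png_up` is hypothetical, `columns` stands in for the /Columns value from DecodeParms, and it assumes one byte per column (Colors 1, BitsPerComponent 8) and handles only tags 0 and 2.

```python
import zlib


def decode_flate_png_up(data: bytes, columns: int) -> bytes:
    """Inflate a Flate stream, then undo the PNG 'Up' filter row by row.

    Assumes one byte per column; only filter tags 0 (None) and 2 (Up)
    are handled in this sketch.
    """
    raw = zlib.decompress(data)
    row_len = columns + 1  # each row starts with a one-byte filter tag
    if len(raw) % row_len != 0:
        raise ValueError("stream length is not a whole number of rows")

    prior = bytes(columns)  # the row above the first row counts as all zeros
    out = bytearray()
    for start in range(0, len(raw), row_len):
        tag = raw[start]
        row = raw[start + 1:start + row_len]
        if tag == 0:
            # None: bytes are stored unchanged
            decoded = bytes(row)
        elif tag == 2:
            # Up: add the byte above, modulo 256
            decoded = bytes((b + p) & 0xFF for b, p in zip(row, prior))
        else:
            raise ValueError(f"unsupported PNG filter tag {tag}")
        out += decoded
        prior = decoded
    return bytes(out)
```

For example, two three-byte rows 1 2 3 and 4 5 6 are stored under the Up filter as the tagged rows 2 1 2 3 and 2 3 3 3; passing the deflated form of those eight bytes with `columns=3` gives back the six original bytes.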