I am attempting to capture a series of numbers (all dollar amounts).  I have
experimented with multiple patterns to no avail.  I created this small
program to demonstrate my problem.

I am reading PDF reports and pulling a description and dollar amounts from
each line.  My pattern also catches the periods used for abbreviations in the
description, and I cannot figure out how to build a pattern that matches only
the dollar amounts.  Here's the layout:

Description  0.00  3,587.46 (125,000.00)

The description may contain parentheses or periods.  The dollar figures are
in US standard accounting format where numbers enclosed in parentheses mean
negative values.  The program contains sample data and you can see the

  function extract(string1)
--   culls the string at the first digit in the report--
     local s, e = string.find(string1, "[%(]*%d+")
     if s == nil then return nil end
     local item = string.sub(string1, 1, s - 1)
     return item
  end
  function trim(s)
    return (string.gsub(s, "^%s*(.-)%s*$","%1"))
  end
--  10/29/07 handle negatives in pattern. DOS uses "-" at end of number
  local record = nil
  local i = 0
  local nums = {}
  local line_data = {}
  local pattern = "[%-%(%$]*[%d%,]*[%d%.%d%d]+[%)%-]*"
  print("debug13- Version 1.0  1/07/08")
  print("debug13- Build CSV data from PDF listing of spreadsheet.", "\n\n")

  line_data[1] = "This is good data  4.00  3.99 (1,768.50)"
  line_data[2] = "Data- In a line  5.00 (100,000.00) 957,123.45"
  line_data[3] = "Repairs- () (Wages)  123,456.99  28,123.45 650.00"
  line_data[4] = "Repairs- Ex. Wages  50,120.00 500.00 1,000.00"
  line_data[5] = "A bunch of negatives (123,456.89) (123,456.90)"
  for i = 1, #line_data do
       for num in string.gmatch(line_data[i], pattern) do
            nums[#nums + 1] = num
       end --do for
       print(#nums, " <=== Number of captured numbers")
       if #nums == 3 then
            descr = trim(extract(line_data[i]))
            print(descr, nums[1], nums[2], nums[3])
       end --if
       if #nums == 4 then
            descr = trim(extract(line_data[i]))
            print(descr, nums[1], nums[2], nums[3], nums[4])
       end --if
       nums = {}
       print("----------------Loop Separator-------------")
  end --do
  print("debug13- End of execution")
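
One possible direction, offered as a sketch rather than a definitive fix.  In
a Lua pattern, a bracketed set matches a single character from the set, so
"[%d%.%d%d]+" matches any run of digits and periods, including the lone
period in "Ex.".  The sketch below instead spells out the shape of one
amount as a sequence.  It assumes every amount ends in a period followed by
exactly two digits.  The names AMOUNT, find_amounts and amount_to_number are
illustrative and are not part of the program above.

```lua
-- Sketch only: assumes each amount has exactly two decimal places.
-- Optional "(" or "-", optional "$", a digit, more digits or commas,
-- a period, two digits, then an optional ")" or trailing "-".
local AMOUNT = "[%(%-]?%$?%d[%d,]*%.%d%d[%)%-]?"

-- Returns a table with the text of every amount found in the line.
function find_amounts(line)
  local amounts = {}
  for text in string.gmatch(line, AMOUNT) do
    amounts[#amounts + 1] = text
  end
  return amounts
end

-- Converts the text of one amount to a number.  Parentheses or a minus
-- sign anywhere in the text are taken to mean a negative value.
function amount_to_number(text)
  local negative = string.find(text, "[%(%)%-]") ~= nil
  local digits = string.gsub(text, "[^%d%.]", "")  -- drop "$", ",", "(", ")", "-"
  local value = tonumber(digits)
  if negative then value = -value end
  return value
end

-- Usage with one of the sample lines from the program above.
for _, text in ipairs(find_amounts("This is good data  4.00  3.99 (1,768.50)")) do
  print(text, amount_to_number(text))
end
```

Because the period must be followed by two digits, an abbreviation such as
"Ex." or an empty "()" in the description no longer counts as a number.  A
description that itself contains something shaped like an amount would still
be picked up, so the count check in the loop remains useful.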
